Method for characterizing polypeptides

ABSTRACT

Provided is a method for characterising a polypeptide, which method comprises the steps of: (a) optionally reducing cysteine disulphide bridges in the polypeptide to form free thiols, and capping the free thiols; (b) cleaving the polypeptide with a sequence specific cleavage reagent to form peptide fragments; (c) optionally deactivating the cleavage reagent; (d) capping one or more ε-amino groups that are present with a lysine reactive agent; (e) analysing peptide fragments by mass spectrometry to form a mass fingerprint for the polypeptide; and (f) determining the identity of the polypeptide from the mass fingerprint.

FIELD OF THE INVENTION

This invention relates to methods of determining a mass fingerprint from digests of polypeptides. The invention in particular relates to the use of labels to improve mass fingerprints. This invention further relates to the use of the above methods in determining the expression of proteins in a tissue, cell type, or sub-cellular compartment or in analysing large protein complexes.

BACKGROUND TO THE ART

The identification of proteins in biological samples is an essential activity of biochemical analysis, particularly the determination of the sequence of a protein, since the sequence determines the structure of a protein, which, in turn, determines the function of the protein. Traditional techniques for protein identification are cumbersome and relatively slow. The mainstay of protein identification techniques has been chemical sequencing of peptides using the Edman degradation, which can sequentially identify amino acids in a peptide from the N-terminus. This sequencing technique is typically used in conjunction with enzymatic digestion of a protein or polypeptide. Typically, an unidentified polypeptide is digested and its component peptides are separated from each other by chromatography. The individual peptides are then subjected to Edman degradation. The sequences of the peptides can be ordered by comparing the sequences of peptides from digestion of the polypeptide with different sequence specific cleavage reagents. This process allows the complete sequence of a polypeptide to be determined. While this has been a highly successful technique for the identification of proteins, it is quite laborious. New technologies, such as Matrix Assisted Laser Desorption Ionisation Time-of-Flight (MALDI-TOF) mass spectrometry, have made rapid protein identification more feasible. This technique has permitted the development of peptide mass fingerprinting as a relatively rapid procedure for protein identification.

A typical peptide mass fingerprinting protocol involves determining the mass of the unidentified protein followed by digestion of the protein with trypsin. Trypsin cleaves polypeptides selectively at arginine and lysine residues, leaving either arginine or lysine at the C-termini of the product peptides. The positions of lysine and arginine in the sequence of a polypeptide determine where the polypeptide is cut, giving rise to a characteristic series of peptides. The pattern of peptides can be easily detected by MALDI-TOF mass spectrometry. This mass spectrometric technique has a large mass range, can readily ionise large biomolecules and will preferentially produce singly charged ions; competition for ionisation with this technique is not severe, although competition can be problematic. This means that there is generally one peak in the mass spectrum for each peptide, the mass-to-charge ratio for each peak has essentially the same value as the mass of the peptide, with an added proton to ionise the peptide, and most (and sometimes all) the peptides from the tryptic digest of an unidentified protein can be analysed simultaneously. In effect the mass spectrum is a ‘bar-code’ in which the lines in the spectrum represent the masses of the characteristic cleavage peptides of the protein. For any given protein, there may be some peptides that have the same mass as a peptide from another protein, but it is very unlikely that two different proteins will give rise to peptides that all have identical masses. This means that the pattern of masses of the tryptic digest of a protein is a fairly unique identifier of that protein and is called a Peptide Mass Fingerprint (PMF).
The relative uniqueness of PMF means that databases of predicted PMFs, determined from known protein sequences or sequences that have been predicted from genomic DNA or expressed sequence tags (ESTs), can be used to identify proteins in biological samples (Pappin D J C, Höjrup P and Bleasby A J, Current Biology 3: 327-332, “Rapid identification of proteins by peptide-mass fingerprinting.” 1993; Mann M, Hojrup P, Roepstorff P. Biol Mass Spectrom 22(6): 338-345, “Use of mass spectrometric molecular weight information to identify proteins in sequence databases.” 1993; Yates J R 3rd, Speicher S, Griffin P R, Hunkapiller T, Anal Biochem 214(2): 397-408, “Peptide mass maps: a highly informative approach to protein identification.” 1993). The PMF for an unknown protein can be compared with all of the PMFs in a database to find the best match, thereby identifying the protein. Searches of this kind can be constrained by determining the mass of the protein prior to digestion. In this way the pattern of masses of an unidentified polypeptide can be related to its sequence, which in turn can help to determine the role of a protein in a particular sample.
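By way of illustration only, the database comparison described above can be sketched in a few lines of Python. The sketch makes several simplifying assumptions that are not part of the original disclosure: cleavage is taken to occur after every lysine (K) or arginine (R) with no missed cleavages and no proline exception, every peptide is detected as a singly protonated ion, the matching tolerance of 0.2 Da is arbitrary, and the two database sequences are hypothetical.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # singly protonated ion, [M+H]+


def tryptic_peptides(sequence):
    """Idealised trypsin: cut after every K or R (assumption: no missed cleavages)."""
    peptides, current = [], ""
    for residue in sequence:
        current += residue
        if residue in "KR":
            peptides.append(current)
            current = ""
    if current:  # C-terminal peptide that does not end in K or R
        peptides.append(current)
    return peptides


def mh_plus(peptide):
    """Monoisotopic m/z of the singly protonated peptide."""
    return sum(RESIDUE_MASS[a] for a in peptide) + WATER + PROTON


def fingerprint(sequence):
    """Predicted peptide mass fingerprint: sorted [M+H]+ values of the digest."""
    return sorted(mh_plus(p) for p in tryptic_peptides(sequence))


def match_count(observed, predicted, tol=0.2):
    """Number of observed peaks lying within tol of any predicted peak."""
    return sum(any(abs(o - p) <= tol for p in predicted) for o in observed)


def best_match(observed, database, tol=0.2):
    """Name of the database entry whose predicted PMF explains the most peaks."""
    return max(database,
               key=lambda name: match_count(observed, fingerprint(database[name]), tol))


# Hypothetical two-entry database and an "observed" spectrum for the first entry.
DATABASE = {"protA": "AGKMRLL", "protB": "WWKFFR"}
observed_peaks = fingerprint("AGKMRLL")
identified = best_match(observed_peaks, DATABASE)  # "protA"
```

A practical search engine would additionally score the matches statistically and allow for missed cleavages and modifications; simple peak counting is used here only to make the principle concrete.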

There are, however, many technical difficulties involved in determining the PMF for a protein. A typical protein will give rise to twenty to thirty peptides after cleavage with trypsin, but not all of these peptides will appear in the mass spectrum. The precise reasons for this are not fully understood. One factor that is believed to cause incomplete spectra is competition for protonation during the ionisation process, resulting in preferential ionisation of arginine containing peptides (Krause E. & Wenschuh H. & Jungblut P. R., Anal Chem. 71(19): 4160-4165, “The dominance of arginine-containing peptides in MALDI-derived tryptic mass fingerprints of proteins.” 1999). In addition, there are surface effects that result from the process of preparing MALDI targets. The targets are prepared by dissolving the peptide digest in a saturated solution of the matrix material. Small droplets of the peptide/matrix solution are dropped onto a metal target and left to dry. Differences in solubility of peptides will mean that some peptides will preferentially crystallise near the top surface of the matrix where they will be desorbed more readily.

Sensitivity is also a problem with conventional protocols for identifying proteins from their PMF. To be an effective tool, it should be possible to determine a PMF for as small a sample of protein as possible to improve the dynamic range of the analysis of protein samples.

Some attempts have been made to improve the ionisation of peptides that do not contain arginine. Conversion of lysine to homo-arginine is one approach that has met with some success (V. Bonetto et al., Journal of Protein Chemistry 16(5): 371-374, “C-terminal Sequence Determination of Modified Peptides by MALDI MS”, 1997; Brancia et al., Electrophoresis 22: 552-559, “A combination of chemical derivitisation and improved bioinformatics tools optimises protein identification for proteomics”, 2001). The conversion of lysine to homo-arginine introduces guanidino functionalities into all of the peptides from a tryptic digest, with the exception of C-terminal peptides, greatly improving the representation of lysine containing peptides in the MALDI-TOF mass spectra.

Conventional techniques for determining the expression of proteins in biological samples depend on protein identification. The goal of protein expression profiling is to identify as many proteins in a sample as possible and, preferably, to determine the quantity of each protein in the sample. A typical method of profiling a population of proteins is by two-dimensional electrophoresis (R. A. Van Bogelen, E. R. Olson, “Application of two-dimensional protein gels in biotechnology.”, Biotechnol Annu Rev, 1:69-103, 1995). In this method a protein sample extracted from a biological sample is separated by two independent electrophoretic procedures. The first separation usually separates proteins on the basis of their iso-electric point using a gel-filled capillary or gel strip along which a pH gradient exists. Proteins migrate electrophoretically along the gradient until the pH is such that the protein has no net charge, referred to as the iso-electric point, from which the protein can migrate no further. After all of the proteins in the sample have reached their iso-electric point, the proteins are separated further using a second electrophoretic procedure. To perform the second procedure, the entire iso-electric focussing gel strip is laid against one edge of a rectangular gel. The separated proteins in the strip are then electrophoretically separated in the second gel on the basis of their size. The proteins are thus resolved into a 2-dimensional array of spots in a rectangular slab of acrylamide. However, after separating the proteins in a sample from each other, there remains the problem of detecting and then identifying the proteins. The currently favoured approach to identify proteins is to analyse the protein in specific spots on the gel by peptide mass fingerprinting using MALDI-TOF mass spectrometry (Jungblut P, Thiede B. “Protein identification from 2-DE gels by MALDI mass spectrometry.” Mass Spectrom Rev. 16:145-162, 1997).
2-DE technology is therefore limited by the detection capabilities of the peptide mass fingerprinting methods used in the identification of proteins in gel spots. The existing technology cannot easily compare the expression levels of two or more samples and there are sensitivity problems with such a complex process due to sample losses during the separation of the proteins and their subsequent recovery from the 2-D gel. In addition, proteins extracted from a 2-D gel are generally in buffers containing solutes that are incompatible with mass spectrometric analysis.

It is an aim of this invention to solve the problems associated with the known methods described above. It is thus an aim of this invention to provide improved methods for producing peptide mass fingerprints, using labels (tags). It is a further aim of this invention to provide methods to determine peptide mass fingerprints using protein reactive reagents that are stable in water, selective for lysine and that work under mild reaction conditions without degradation of the reagents.

DETAILED DESCRIPTION OF THE INVENTION

Accordingly, the present invention provides a method for characterising a polypeptide, which method comprises the steps of:

(a) optionally reducing cysteine disulphide bridges in the polypeptide to form free thiols, and capping the resulting free thiols;
(b) cleaving the polypeptide with a sequence specific cleavage reagent to form peptide fragments;
(c) optionally deactivating the cleavage reagent;
(d) capping one or more ε-amino groups that are present with a lysine reactive agent, preferably a labelled lysine-reactive agent;
(e) analysing peptide fragments by mass spectrometry to form a mass fingerprint for the polypeptide; and
(f) determining the identity of the polypeptide from the mass fingerprint.

The order of the steps as listed above is not intended to represent the order in which the steps must be carried out, and the skilled person will appreciate that the order of some of the steps can be interchanged if desired. Thus, although one preferred order of the non-optional steps is (b), (d), (e) and then (f), another possible order is (d), (b), (e) and then (f). Thus, capping step (d) can be carried out before cleaving or after cleaving. For both of these orders, reducing step (a) can be carried out at any time provided that it comes prior to the capping step (d). Also for both of these orders deactivating step (c) can be carried out at any time, provided that it comes after the cleaving step (b), but preferably it is carried out directly after cleaving step (b).
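The ordering constraints set out above can be summarised, purely as an illustrative sketch, by a short Python check. The function name is hypothetical, and the requirement that analysing step (e) follows both (b) and (d) and precedes (f) is an assumption inferred from the two orders listed above rather than an explicit statement in the text.

```python
def order_is_valid(order):
    """Check a proposed order of steps (a)-(f), given as a string such as "abcdef".

    Constraints encoded:
      - steps (b), (d), (e) and (f) are mandatory; (a) and (c) are optional;
      - (b) and (d) may occur in either order;
      - (e) follows both (b) and (d), and (f) follows (e)   [inferred assumption];
      - (a), if present, precedes capping step (d);
      - (c), if present, follows cleaving step (b).
    """
    position = {step: i for i, step in enumerate(order)}
    if len(position) != len(order):          # a step was repeated
        return False
    if not set(position) <= set("abcdef"):   # unknown step label
        return False
    if not set("bdef") <= set(position):     # a mandatory step is missing
        return False
    valid = max(position["b"], position["d"]) < position["e"] < position["f"]
    if "a" in position:
        valid = valid and position["a"] < position["d"]
    if "c" in position:
        valid = valid and position["b"] < position["c"]
    return valid
```

For instance, `order_is_valid("bdef")` and `order_is_valid("dbef")` both hold, whereas `order_is_valid("dabef")` does not, because reduction would then follow capping.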

It will be appreciated from the above that this method specifically relates to identifying an unknown polypeptide which may be already isolated or may be present in a sample comprising a population of polypeptides.

The present invention also provides a method for characterising a population of polypeptides, which method comprises the steps of:

(a) optionally reducing cysteine disulphide bridges in one or more polypeptides to form free thiols, and capping the resulting free thiols;
(b) separating one or more polypeptides from the population;
(c) cleaving one or more polypeptides with a sequence specific cleavage reagent to form peptide fragments;
(d) optionally deactivating the cleavage reagent;
(e) capping one or more ε-amino groups that are present with a lysine reactive agent, preferably a labelled lysine-reactive agent;
(f) analysing the peptide fragments by mass spectrometry to form a mass fingerprint for one or more of the polypeptides; and
(g) determining the identity of one or more polypeptides from the mass fingerprint.

The order of the steps as listed above is again not intended to represent the order in which the steps must be carried out, and the skilled person will appreciate that the order of some of the steps can be interchanged if desired. Thus although one preferred order of the non-optional steps is (b), (c), (e), (f) and then (g), other possible orders are (b), (e), (c), (f) and then (g), and also (e), (b), (c), (f) and then (g). Thus, capping step (e) can be carried out before separating and cleaving, after separating and cleaving or even between separating and cleaving. For all of these orders, separating step (b) must be carried out prior to cleaving step (c). Also for all of these orders, reducing step (a) can be carried out at any time provided that it comes prior to the capping step (e). Again for all of these orders deactivating step (d) can be carried out at any time, provided that it comes after the cleaving step (c), but preferably it is carried out directly after cleaving step (c).

It will be appreciated that this method allows the identification of a plurality of polypeptides in a sample and may be employed to determine the full expression profile of a sample, if desired. Alternatively, this method may be employed to assay for a known polypeptide in a sample whose composition is not known. In these aspects the peptide mass fingerprints of the polypeptides in the sample are determined and compared with the peptide mass fingerprint for the known polypeptide or polypeptides to see which ones are present, and preferably to see in what quantity they are present.

The present invention also provides a method for comparing a plurality of samples, each sample comprising one or more polypeptides, which method comprises the steps of:

(a) optionally reducing cysteine disulphide bridges and capping the resulting free thiols in one or more polypeptides from the samples;
(b) separating one or more polypeptides from each of the samples;
(c) cleaving the polypeptides with a sequence specific cleavage reagent to form peptide fragments;
(d) optionally deactivating the cleavage reagent;
(e) capping one or more ε-amino groups that are present with a lysine reactive agent, preferably a labelled lysine-reactive agent;
(f) analysing the peptide fragments by mass spectrometry to form a mass fingerprint for one or more polypeptides in the samples; and
(g) determining the identity of one or more polypeptides in the samples from one or more mass fingerprints.

The order of the steps as listed above is not intended to represent the order in which the steps must be carried out, and the skilled person will appreciate that the order of some of the steps can be interchanged if desired. Thus although one preferred order of the non-optional steps is (b), (c), (e), (f) and then (g), other possible orders are (b), (e), (c), (f) and then (g), and also (e), (b), (c), (f) and then (g). Thus, capping step (e) can be carried out before separating and cleaving, after separating and cleaving or even between separating and cleaving. For all of these orders, separating step (b) must be carried out prior to cleaving step (c). Also for all of these orders, reducing step (a) can be carried out at any time provided that it comes prior to the capping step (e). Again for all of these orders deactivating step (d) can be carried out at any time, provided that it comes after the cleaving step (c), but preferably it is carried out directly after cleaving step (c).

In this embodiment of the invention, it is preferred that at some stage in the method the samples are pooled to make processing more efficient. If the samples are pooled, they can be resolved by ensuring that the same label is employed for polypeptides or peptides from the same sample, and different labels are employed for polypeptides or peptides from different samples, such that the sample from which a polypeptide or peptide originates can be determined from its label. The labels are preferably introduced in the capping step and are thus preferably attached to the lysine-reactive agent. The pooling step can take place at any time, provided that the samples are individually labelled, as discussed above. Thus, if the labels are introduced during the capping step, pooling must take place after capping to ensure that the samples do not become mixed before the labels have been introduced. Preferably the samples are pooled before the individual proteins are separated so that all the proteins in all the samples are separated at the same time in the same step. This is particularly efficient.

In some cases a particular protein will be present in more than one sample. These proteins will clearly have the same mass fingerprint. If the proteins are not separated, these mass fingerprints will be overlaid after performing mass spectrometry on the cleavage products. However, each fingerprint can be resolved due to the presence of the labels. Therefore, since the identity of the sample from which the protein comes can be resolved, it can be advantageous when comparing two or more samples to identify the same proteins together in the same spectrum to compare their expression levels. Thus, in some embodiments it is preferred that the same proteins from different samples do not become separated. This can be achieved by ensuring that the different labels used for each sample all have the same mass. Labels of this type that can be used in this invention are described in PCT/GB01/01122.

As will be clear from the above-mentioned order of the method steps, generally it is preferred that the polypeptides in a sample are separated before cleavage occurs, since identical fragments may be produced from different polypeptides, which may prevent resolution of the different mass fingerprints in some cases. The cleaving step preferably takes place after separating to avoid fragments from one polypeptide becoming mixed with fragments from other polypeptides. This particularly applies to methods involving a number of samples, since these samples can be more conveniently labelled in the capping step prior to any separation.

The present methods have the advantage of improved sensitivity and can increase the number of peptides that are detected from a protein. In addition, through the use of appropriate tags, it is possible with this invention to analyse multiple samples simultaneously and it is also possible to determine the ratios of corresponding peptides in the different samples. With appropriate labelling procedures, it is also possible to facilitate the conditioning of polypeptide samples for detection by mass spectrometry.

The steps (b) and (d) of the method of the present invention can be carried out in any order, provided that the peptide fragments can be isolated. Thus, in some embodiments the peptides can be cleaved prior to capping, or in other embodiments, the residues can be capped whilst still forming part of a polypeptide, which polypeptide is subsequently cleaved. In the latter embodiments, the cleavage reagent is preferably capable of cleaving on the C-terminal side of lysine residues even after these residues have been capped.

The peptide fragments comprising capped ε-amino groups are preferably removed by capturing these fragments, e.g. on a solid phase. In this embodiment, the lysine reactive agent is a lysine selective capture agent. Selective capture may be achieved by attaching a capture group (such as biotin) to the lysine reactive agent, which ensures that the agent along with its capped peptide fragment attaches to a solid phase (such as an avidinated solid phase) after capping has occurred. In an alternative embodiment, the lysine reactive agent may be attached to a solid phase before the capping takes place, so that the peptide fragments are captured onto the solid phase by the capping reaction itself.

The capped fragments can thus be removed from the sample by separating the sample from the solid phase, leaving the capped fragments separate from the sample on the solid phase. These fragments may then be analysed to determine the polypeptides present in the original sample.

The method of the invention allows lower concentrations of the reagents to be used at higher pH. Both of these factors have been found by the inventors to improve the selectivity and completeness of lysine reactions. In the following description, lysine amino groups will be referred to as epsilon amino (ε-amino) groups. The lysine reactive agent is preferably a hindered Michael reagent. A Michael reagent has a general formula as below:

In the above formula, X is an electron withdrawing group that is capable of stabilising a negative charge. The functional group —X is preferably selected from those listed in Table 1 below:

TABLE 1

Functional Group / Structure

Aldehyde

Amide

Ester

Ketone

Nitrile / —C≡N

Pyridine ring

Sulphone

Where R¹ may be any alkyl or aromatic group but is preferably an electron withdrawing group and more preferably a cyclic or heterocyclic aromatic ring or fused ring. Preferably the ring structure is electron withdrawing. More specifically R¹ is preferably a small ring or fused ring such as a phenyl, pyridyl, naphthyl or quinolyl ring structure. Preferred ring structures are substituted with appropriate electron withdrawing groups such as halogens like fluorine or nitro groups. Preferred ring structures promote water solubility, such as pyridyl and naphthyl rings. If —X is an amide, then one or both of the R¹ groups may be a hydrogen atom. If —X is a nitrile, preferred compounds include crotonitriles such as trifluorocrotonitrile. R¹ may additionally comprise a linker to an affinity capture functionality, such as biotin, or a linker to a solid phase support. In the formula above R² is either a hydrogen atom or it may comprise an electron-withdrawing group and/or a linker to an affinity capture functionality or a linker to a solid phase support. Further specific groups that R² may be are listed below in the definition of the group Sub.

To be a ‘hindered’ Michael reagent according to this invention, at least one of the R groups is not hydrogen and is considered to be a sterically hindering group. At least one R group may comprise an alkyl or aromatic group such as a methyl or phenyl group. More preferably at least one of the R groups is electron-withdrawing and may comprise a halogen atom or a halogenated alkyl group, such as a fluoromethyl, difluoromethyl or trifluoromethyl group, or a phenyl ring with electron withdrawing substituents such as halogen or nitro groups. In addition, one R-group may comprise a linker to an affinity capture functionality, such as biotin, or a linker to a solid phase support. Conversely, to be an ‘unhindered’ Michael reagent in the context of this invention, both R groups would be hydrogen.

In a preferred embodiment, one (and more preferably only one) of the X—, R—, R¹— and R²— groups comprises a linker to an affinity capture functionality, such as biotin, or a linker to a solid phase support.

In some embodiments, the X group may be joined to one of the R groups to form a ring. Preferred compounds of this type include maleimides of the formula:

Where R has the same meaning as above and R¹ is a hydrocarbon group or an electron donating group. Preferably R comprises an alkyl group or aryl group and particularly preferably R comprises a C₁-C₆ alkyl group, such as a methyl or ethyl group.

The group Sub in the above formulae is not particularly limited, provided that the Michael agent is capable of reacting with an ε-amino group. The group is generally a group R² as defined above, and more specifically in preferred embodiments of the invention, Sub comprises a hydrocarbon group such as an alkyl or aryl group or an electron withdrawing group, such as a cyano group (—CN), or a halogen (F, Cl, Br, I) or halogen-containing group. In the most preferred embodiments, Sub comprises a hydrogen, or a C₁-C₆ alkyl group, such as a methyl or ethyl group. A particularly preferred compound is one in which Sub and R are both H and R′ comprises a methyl group or an ethyl group.

In the context of this invention, the term lysine-selective reagent refers to the ability of the reagent to discriminate between the epsilon-amino group of lysine and the alpha-amino groups of all amino acids. It is also preferred that the reagents of this invention do not react with other side chain functionalities such as the imidazole ring of histidine, the guanidino group of arginine and hydroxyl functionalities found in serine, threonine and tyrosine.

In the context of this invention, the term capture reagent refers to the ability of the reagent to capture molecules onto a solid support. Thus, as mentioned above, the capture reagent may comprise a reactive functionality linked covalently to a solid phase support, or it may comprise a reactive functionality linked to a functionality that can be chemically linked to a solid phase support, or it may comprise a reactive functionality linked to an affinity capture functionality, which can be captured to a solid support by interaction with a specific ligand that is linked to the solid support.

The various aspects of this invention will now be discussed in more detail below.

In one embodiment of this invention there is provided a method of determining a mass fingerprint for a polypeptide comprising the steps of:

1. Digesting the polypeptide completely with a sequence specific cleavage reagent.
2. Reacting the polypeptide with a lysine reactive hindered Michael reagent so that all available epsilon-amino groups in the polypeptide are capped with the reagent and preferably only one molecule of the alkylating Michael reagent reacts with each epsilon-amine available in the polypeptide.
3. Analysing the labelled peptides from the digested polypeptide by mass spectrometry.

In this and other embodiments of the present invention, a further optional step may also be carried out in case disulphide linkages are present. This step involves reducing disulphide linkages in the polypeptides, and capping resultant free thiols (and/or free thiols initially present) in the polypeptides. If desired, this step may be carried out prior to digesting the sample with the cleavage agent, e.g.:

1. Optionally reducing cysteine disulphide bridges and capping of free thiols.
2. Digesting the polypeptide completely with a sequence specific cleavage reagent.
3. Reacting the polypeptide with a lysine reactive hindered Michael reagent so that all available epsilon-amino groups in the polypeptide are capped with the reagent and preferably only one molecule of the alkylating Michael reagent reacts with each epsilon-amine available in the polypeptide.
4. Analysing the labelled peptides from the digested polypeptide by mass spectrometry.
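The effect of the capping step on the observed masses may be illustrated by the following minimal Python sketch. It assumes, as is generally the case for a Michael addition, that the whole reagent adds to the amine so that each capped lysine increases the peptide mass by the full mass of the reagent, and it assumes the preferred case in which exactly one reagent molecule reacts with each epsilon-amino group. The function name and the reagent mass of 150.0 Da are hypothetical placeholders and do not correspond to any reagent disclosed herein; capping of cysteine thiols is not modelled.

```python
def capped_mass(unlabelled_mz, peptide, reagent_mass):
    """Predicted m/z of a singly charged peptide after lysine capping.

    unlabelled_mz: m/z of the uncapped, singly charged peptide
    peptide:       one-letter amino acid sequence; each "K" is one lysine
    reagent_mass:  mass added per capped lysine (hypothetical placeholder value)
    Assumption: one reagent molecule per epsilon-amino group, none elsewhere.
    """
    return unlabelled_mz + peptide.count("K") * reagent_mass


# A peptide containing two lysines shifts by twice the (placeholder) reagent mass.
shifted = capped_mass(500.0, "AKGK", 150.0)  # 800.0
```

Since a peptide from a complete tryptic or Lys-C digest carries at most one lysine, each capped peptide in such a digest would be expected to shift by either zero or one reagent mass under these assumptions.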

In a further aspect, this invention provides a method for determining the expression profile of a sample, which method comprises characterising a plurality of polypeptides from one or more mixtures of polypeptides according to a method as defined above. Thus, this aspect of the invention provides a method of determining the expression profile of at least one mixture of polypeptides and is a method to identify and preferably also to quantify each polypeptide in the mixture.

In preferred embodiments of this invention the sequence specific cleavage reagent is Trypsin or Lys-C.

In preferred embodiments of this invention the lysine reactive tag comprises a sensitivity enhancing group. This sensitivity enhancing group improves the ionisation efficiency of the tagged peptides. Preferred sensitivity enhancing groups include non-fluorescent dyes such as cinnamic acid derivatives, tertiary amino groups, guanidino groups, quaternary ammonium groups or pyridinium groups.

In some embodiments of the invention, the lysine reactive tag may comprise an affinity capture agent such as biotin.

In a yet further aspect, this invention provides a lysine selective protein labelling reagent that comprises a thiol and amino reactive hindered alkenyl sulphone compound with the formula:

Where R¹ may be any alkyl or aromatic group but is preferably an electron withdrawing group and more preferably a cyclic or heterocyclic aromatic ring or fused ring. Preferably the ring structure is electron withdrawing. More specifically R¹ is preferably a small ring or fused ring such as a phenyl, pyridyl, naphthyl or quinolyl ring structure. Preferred ring structures are substituted with appropriate electron withdrawing groups such as halogens like fluorine or nitro groups. Preferred ring structures promote water solubility, such as pyridyl and naphthyl rings. R¹ may additionally comprise a linker to an affinity capture functionality, such as biotin, or a linker to a solid phase support.

In the formula above R² is most preferably a hydrogen atom, but it may alternatively comprise an electron-withdrawing group and/or a linker to an affinity capture functionality or a linker to a solid phase support.

To be a ‘hindered’ Michael reagent according to this invention, at least one of the R groups is not hydrogen and is considered to be a sterically hindering group. At least one R group may comprise an alkyl or aromatic group such as a methyl or phenyl group. More preferably at least one of the R groups is electron-withdrawing and may comprise a halogen atom or a halogenated alkyl group, such as a fluoromethyl, difluoromethyl or trifluoromethyl group, or a phenyl ring with electron withdrawing substituents such as halogen or nitro groups. In addition, one R-group may comprise a linker to an affinity capture functionality, such as biotin, or a linker to a solid phase support. Conversely, to be an ‘unhindered’ Michael reagent in the context of this invention, both R groups would be hydrogen. Preferably one, and more preferably only one, of the R—, R¹ and R² groups comprises a linker to an affinity capture functionality, such as biotin, or a linker to a solid phase support.

In a still further aspect, this invention provides a method of comparing the expression levels of polypeptides in two or more biological samples that comprise a mixture of polypeptides by determining a mass fingerprint for the polypeptides. The preferred method comprises the following steps:

1. For each sample of polypeptides, optionally reducing cysteine disulphide bridges and capping of free thiols in all of the polypeptides.
2. Reacting each sample of polypeptides with a lysine reactive hindered Michael reagent so that all available epsilon-amino groups in the polypeptides are capped with the reagent and preferably only one molecule of the alkylating Michael reagent reacts with each epsilon-amine available in the polypeptides. Each sample is labelled with a different tag from every other sample, where the differences between the tags are resolvable by mass spectrometry.
3. Pooling the labelled samples.
4. Separating the component polypeptides of the pooled samples so that each different polypeptide may be isolated.
5. Digesting each polypeptide completely with a sequence specific cleavage reagent.
6. Analysing the labelled peptides from the digested polypeptide by mass spectrometry.
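As a hedged illustration of how the final analysis step might be used to compare two pooled samples, the following Python sketch searches a peak list for pairs of peaks separated by a whole multiple of the mass difference between two tags and reports the intensity ratio of each pair. The function name, the tag mass difference of 4.0 Da, the tolerance, the limit of three lysines per peptide and the peak list itself are all hypothetical; singly charged ions and one tag per lysine are assumed.

```python
def paired_ratios(peaks, delta, max_lysines=3, tol=0.05):
    """Find peak pairs separated by n * delta and report heavy/light intensity ratios.

    peaks:       list of (m/z, intensity) tuples from one spectrum
    delta:       mass difference between the two tags (hypothetical value)
    max_lysines: largest number of tagged lysines considered per peptide
    Returns a list of (light m/z, heavy m/z, n lysines, heavy/light ratio).
    """
    results = []
    for mz_light, int_light in peaks:
        if int_light <= 0:
            continue  # cannot form a ratio against a zero-intensity peak
        for n in range(1, max_lysines + 1):
            target = mz_light + n * delta
            for mz_heavy, int_heavy in peaks:
                if abs(mz_heavy - target) <= tol:
                    results.append((mz_light, mz_heavy, n, int_heavy / int_light))
    return results


# Hypothetical spectrum: one peptide with one lysine, one peptide with two lysines.
spectrum = [(1000.50, 200.0), (1004.50, 100.0), (1500.70, 50.0), (1508.70, 150.0)]
pairs = paired_ratios(spectrum, delta=4.0)
```

In this made-up spectrum the first pair would indicate a two-fold lower level of the peptide in the sample carrying the heavier tag, and the second pair a three-fold higher level. Real spectra would additionally require isotope-envelope handling and checks against coincidental peak spacings.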

The invention will now be described in more detail by way of example only, with reference to the following Figures:

FIG. 1 shows a selection of preferred hindered alkenyl sulphone reagents for use with this invention. Synthetic procedures for the production of some of these reagents are described in the examples section;

FIG. 2 shows a schematic illustration of the first aspect of this invention in which a polypeptide is prepared for peptide mass fingerprinting using Trypsin as the sequence specific cleavage reagent;

FIG. 3 shows a schematic illustration of the first aspect of this invention in which a polypeptide is prepared for peptide mass fingerprinting using Lys-C as the sequence specific cleavage reagent;

FIG. 4 shows a schematic illustration of the first aspect of this invention in which a polypeptide is prepared for peptide mass fingerprinting using Trypsin as the sequence specific cleavage reagent and a lysine-selective tag that comprises a biotin affinity tag;

FIG. 5 shows a schematic illustration of the first aspect of this invention in which a polypeptide is prepared for peptide mass fingerprinting using Lys-C as the sequence specific cleavage reagent;

FIG. 6 shows a schematic illustration of the first aspect of this invention in which a polypeptide is prepared for peptide mass fingerprinting using a lysine-selective tag that is reacted with the polypeptide prior to cleavage using Trypsin as the sequence specific cleavage reagent;

FIG. 7 shows the mass spectrum of an example of a protocol for labelling both the thiols and epsilon amino groups of a peptide. In this example the peptide is Calcitonin S and the thiols are labelled with a different tag from the epsilon amino groups;

FIG. 8 shows the mass spectrum of an example of a protocol for labelling both the thiols and epsilon amino groups of a peptide with the same label;

FIG. 9 shows the mass spectrum of an example of a protocol for labelling both the thiols and epsilon amino groups of a mixture of peptides; in this example the thiols are labelled with the same tag as the epsilon amino groups;

FIG. 10 shows the mass spectrum of an example of a protocol for labelling the alpha-amino groups of a mixture of peptides where both the thiols and epsilon-amino groups of the peptides have already been blocked with the same mass tag.

The lysine reactive (lysine selective) reagents used in the methods of the present invention will now be described in more detail.

Many amine selective protein reactive reagents are known in the art. These reagents will all have some degree of discrimination in favour of reaction with lysine over alpha amino groups at high pH but not many show sufficient discrimination to allow lysine to be labelled almost exclusively. A number of lysine-selective reagents have been described in the prior art and these are all appropriate for use with this invention, particularly cyclic anhydrides. Pyromellitic dianhydride and o-sulphobenzoic acid anhydride are reported to be lysine selective acylating reagents (Bagree et al., FEBS Lett. 120 (2): 275-277, 1980). Similarly phthalic anhydride, whose structure and reactivity are similar to those of pyromellitic dianhydride, would be expected to be lysine selective. Phthalic anhydride is reported to have few side-reactions with other amino acids (Palacian E. et al., Mol Cell Biochem. 97 (2): 101-111, 1990). More importantly, most reagents that react with lysine are not stable at high pH, particularly active esters such as carboxylic acid anhydrides, N-hydroxysuccinimide esters and pentafluorophenyl esters. These reagents must therefore be used in large excess, which exacerbates the lack of selectivity of the reaction.

Michael reagents have a number of properties that make them attractive for protein reactions and have been used quite widely for this purpose (Friedman M. & Wall J. S., J Org Chem 31: 2888-2894, “Additive Linear Free-Energy Relationships in Reaction Kinetics of Amino Groups with alpha,beta-Unsaturated Compounds.” 1966; Morpurgo M. & Veronese F. M. & Kachensky D. & Harris J. M., Bioconjug Chem 7(3): 363-368, “Preparation and characterization of poly(ethylene glycol) vinyl sulfone.” 1996; Friedman M. & Finley J. W., Int J Pept Protein Res 7(6): 481-486, “Reactions of proteins with ethyl vinyl sulfone.” 1975; Masri M. S. & Friedman M., J Protein Chem 7(1): 49-54, “Protein reactions with methyl and ethyl vinyl sulfones.” 1988; Graham L. & Mechanic G. L., Anal Biochem 153(2): 354-358, “[14C]acrylonitrile: preparation via a stable tosylate intermediate and quantitative reaction with amine residues in collagen.” 1986; Esterbauer H. & Zollner H. & Scholz N., Z Naturforsch [C] 30(4): 466-473, “Reaction of glutathione with conjugated carbonyls.” 1975).

There are a number of these reagents that are relatively stable in aqueous solution and the structures of these compounds can be varied extensively to achieve different degrees of reactivity and selectivity. Other reagents used for protein labelling are often not very stable in water and are less easily modified. In particular, reactions with amines are often done with active esters, which are quite susceptible to hydrolysis. Reagents based on sulphones are generally more convenient and effective for labelling amino groups than the more widely used esters. Michael reagents that have been used with proteins include compounds such as acrylonitrile, acrylamide, vinyl pyridine, methyl vinyl sulphone and methyl vinyl ketone. The reactions of these compounds have been compared (Friedman M. & Wall J. S., cited above) and linear relationships between the reaction kinetics of these structurally similar compounds are observed. These linear relationships indicate that the reactions of this class of compounds take place by the same mechanism although their rates of reaction differ, with the sulphone and ketone compounds found to be by far the most reactive. The vinyl compounds, i.e. acrylonitrile, acrylamide, vinyl pyridine, methyl vinyl sulphone and methyl vinyl ketone, have broadly the same relative rates of reaction with different substrates but differ from each other in their overall rates of reaction. These linear relationships make it reasonable to assume that the reactions of this class of compounds take place by the same mechanism and that changes to substituents in this class of compounds, particularly at the beta position of the reactive double bond, will produce similar changes in behaviour in the whole class of compounds. For example, it would be expected that the change in relative reaction rates of crotononitrile with a series of substrates when compared with acrylonitrile would be essentially the same as the change in relative reaction rates of methyl propenyl sulphone with a series of substrates when compared with methyl vinyl sulphone. This means that the properties of methyl propenyl sulphone will be essentially the same as those of crotononitrile except that the rate of reaction of the sulphone will be faster.
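The expectation set out above can be put in numerical form. The following is a minimal sketch only, assuming that the linear relationships hold exactly; the function name and all rate constants are hypothetical illustrations and are not measured values.

```python
def predict_hindered_rate(k_vinyl_ref, k_hindered_ref, k_vinyl_target):
    """Estimate the rate constant of a hindered reagent from its unhindered parent.

    Assumes that a beta substituent changes the rate by the same factor in
    both reagent families, e.g. acrylonitrile -> crotononitrile and
    methyl vinyl sulphone -> methyl propenyl sulphone.
    """
    if k_vinyl_ref <= 0:
        raise ValueError("reference rate constant must be positive")
    substituent_factor = k_hindered_ref / k_vinyl_ref
    return k_vinyl_target * substituent_factor


# Hypothetical rate constants (arbitrary units) for a single substrate.
k_acrylonitrile = 2.0
k_crotononitrile = 0.5
k_methyl_vinyl_sulphone = 40.0

k_methyl_propenyl_sulphone = predict_hindered_rate(
    k_acrylonitrile, k_crotononitrile, k_methyl_vinyl_sulphone
)
```

With these made-up values the beta-methyl group slows both families by a factor of four, so the propenyl sulphone is predicted to remain faster than crotononitrile by the same margin that the vinyl sulphone is faster than acrylonitrile.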

The choice of a Michael reagent for the purposes of this invention is dependent on a number of criteria, including rates of reaction, chances of side-reactions apart from the Michael addition and ease of synthesis of different variants of the compound. Vinyl ketones can, for example, undergo other reactions besides Michael addition, particularly nucleophilic attack of the ketone after Michael addition has taken place. The ketone functionality can undergo this further reaction with a variety of nucleophiles, including the usual biological nucleophiles. Similarly, nitrile compounds can undergo hydrolysis of the nitrile functionality to the carboxylic acid, although typically this reaction will not occur under the conditions used in most biological assays. Alkenyl sulphones do not undergo reactions other than the Michael addition under the conditions used in typical biological assays. Alkenyl sulphones generally react rapidly with biological nucleophiles and there is an extensive literature on the synthesis of different forms of alkenyl sulphone. For these reasons alkenyl sulphones are preferred Michael Reagents for use in the biological assays of this invention. Maleimide compounds such as N-ethylmaleimide also react rapidly with proteins by Michael addition and are reasonably stable under the conditions used for labelling proteins, although alkaline hydrolysis is observed when these reagents are polymer bound. Thus maleimide compounds are also preferred Michael Reagents for use in the biological assays of this invention. In most circumstances nitrile reagents are also preferred reagents although a nitrile reagent will tend to react more slowly than corresponding sulphones. Similarly acrylamides react still more slowly. These preferences do not mean that the other Michael reagents available are unsuitable for this invention, but for most purposes rapid reaction of the reagents is preferred. Under appropriate conditions almost any of the Michael reagents could be used in the methods of this invention.

A preferred class of lysine-selective reagents for use in this invention comprises hindered alkenyl sulphones as the lysine selective reactive groups. Combinations of these reagents under appropriate mild conditions can allow a high degree of discrimination between alpha-amino groups and lysine epsilon-amino groups in amine-labelling reactions. Vinyl sulphones are known to react readily with primary amines giving a di-alkylated product. The inventors have shown that these reagents will react more rapidly with epsilon-amino groups at high pH (>9.0) than with alpha-amino groups but the discrimination of these unhindered sulphones is poor. More hindered alkenyl sulphones such as propenyl sulphones and butenyl sulphones show a greatly enhanced discrimination in favour of epsilon amino groups when compared with the vinyl sulphones. In addition, these hindered reagents produce the mono-alkylated product almost exclusively. Moreover, lysine epsilon-amino groups that have been mono-alkylated with some of the more hindered sulphones are resistant to further reaction with other amine reactive reagents.
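The difference between mono- and di-alkylation shows up directly in the mass of a labelled peptide. Because a Michael addition adds the whole reagent to the amine with no leaving group, each adduct increases the peptide mass by the mass of the reagent. The sketch below illustrates this under stated assumptions: the function name, the example sequence and the reagent mass of 120.0 are placeholders, and thiols are assumed to have been capped separately beforehand.

```python
def expected_mass_shift(sequence, reagent_mass, adducts_per_amine=1,
                        label_alpha_amino=False):
    """Return the total mass added to a peptide by a Michael reagent.

    sequence          -- one-letter amino acid sequence of the peptide
    reagent_mass      -- mass of one reagent molecule (all of it is added)
    adducts_per_amine -- 1 for mono-alkylation (hindered reagents),
                         2 for di-alkylation (unhindered vinyl sulphones)
    label_alpha_amino -- True if the N-terminal alpha-amino group also reacts
    """
    n_amines = sequence.upper().count("K")  # one epsilon-amino group per lysine
    if label_alpha_amino:
        n_amines += 1
    return n_amines * adducts_per_amine * reagent_mass


# Hypothetical peptide with two lysines and a placeholder reagent mass.
selective_shift = expected_mass_shift("AKGKR", 120.0)
unselective_shift = expected_mass_shift("AKGKR", 120.0, adducts_per_amine=2,
                                        label_alpha_amino=True)
```

Comparing an observed mass shift with these two extremes gives a quick indication of whether a labelling reaction has been lysine selective and mono-alkylating.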

This discrimination by hindered sulphones means that epsilon-amino groups can be selectively labelled in preference to alpha-amino groups under mild aqueous conditions with convenient, stable, water-soluble reagents. If a lysine selective capture reagent is required the hindered alkenyl sulphone functional groups of this invention can be linked to a solid support. Alternatively an affinity capture reagent can be generated by linking the hindered alkenyl sulphone functional groups of this invention to biotin or digoxigenin, for example. As a further alternative the hindered alkenyl sulphone functionalities may be covalently linked to a second reactive functionality that is reactive with an appropriately derivatised solid phase support. Boronic acid is known to selectively react with vicinal cis-diols and chemically similar ligands, such as salicylhydroxamic acid. Reagents comprising boronic acid have been developed for protein capture onto solid supports derivatised with salicylhydroxamic acid (Stolowitz M. L. et al., Bioconjug Chem. 12 (2): 229-239, “Phenylboronic Acid-Salicylhydroxamic Acid Bioconjugates. 1. A Novel Boronic Acid Complex for Protein Immobilization.” 2001; Wiley J. P. et al., Bioconjug. Chem. 12 (2): 240-250, “Phenylboronic Acid-Salicylhydroxamic Acid Bioconjugates. 2. Polyvalent Immobilization of Protein Ligands for Affinity Chromatography.” 2001, Prolinx, Inc, Washington State, USA). It is anticipated that it should be relatively simple to link a phenylboronic acid functionality to a hindered alkenyl sulphone functionality to generate capture reagents that can be captured by selective chemical reactions. The use of this sort of chemistry would not be directly compatible with proteins bearing vicinal cis-diol-containing sugars; however, these sorts of sugars could be blocked with phenylboronic acid or related reagents prior to reaction with boronic acid derivatised lysine selective reagents. Solution phase capture reagents that may be captured onto solid supports are advantageous as the lysine reaction may take place in the solution phase, with a large excess of reagent to drive the reaction to completion quickly.

Numerous methods of synthesising hindered alkenyl sulphones are known in the art. For general reviews of synthetic methods that have been used for the synthesis of alpha,beta-unsaturated sulphones see Simpkins N., Tetrahedron 46: 6951-6984, “The chemistry of vinyl sulphones”, 1990; and Fuchs P. L. and Braish T. F., Chem Rev. 86: 903-917, “Multiply Convergent Synthesis via Conjugate-Addition Reactions to Cycloalkenyl Sulfones”, 1986.

Preferred hindered alkenyl sulphone compounds of this invention have the formula:

Where R¹ may be any alkyl or aromatic group but is preferably an electron withdrawing group and more preferably a cyclic or heterocyclic aromatic ring or fused ring. Preferably the ring structure is electron withdrawing. More specifically R¹ is preferably a small ring or fused ring such as a phenyl, pyridyl, naphthyl or quinolyl ring structure. Preferred ring structures are substituted with appropriate electron withdrawing groups such as halogens like fluorine or nitro groups. Preferred ring structures promote water solubility, such as pyridyl and naphthyl rings. R¹ may additionally comprise a linker to an affinity capture functionality, such as biotin, or a linker to a solid phase support.

In the formula above R² is either a hydrogen atom or it may comprise an electron-withdrawing group and/or a linker to an affinity capture functionality or a linker to a solid phase support.

To be a ‘hindered’ Michael reagent according to this invention, at least one of the R groups is not hydrogen and is considered to be a sterically hindering group. At least one R group may comprise an alkyl or aromatic group such as a methyl or phenyl group.

More preferably at least one of the R groups is electron-withdrawing and may comprise a halogen atom or a halogenated alkyl group, such as a fluoromethyl, difluoromethyl or trifluoromethyl group, or a phenyl ring with electron-withdrawing substituents such as halogen or nitro groups. In addition, one R group may comprise a linker to an affinity capture functionality, such as biotin, or a linker to a solid phase support. Conversely, to be an ‘unhindered’ Michael reagent in the context of this invention, both R groups would be hydrogen. One, and preferably only one, of the R, R¹ and R² groups comprises a linker to an affinity capture functionality, such as biotin, or a linker to a solid phase support.

Various entry-points into the synthesis of alkenyl sulphones may be contemplated to produce compounds that are appropriately substituted for use with this invention. Aldol condensation-type reactions can be used. Methyl phenyl sulphone can be reacted with a variety of ketones and aldehydes to give hindered alkenyl sulphones (see FIG. 1 and the reviews above). Appropriate ketones include acetone and hexafluoroacetone. Aldehydes include benzaldehyde, fluorobenzaldehyde, difluorobenzaldehyde, trifluoromethylbenzaldehyde and nitrobenzaldehyde. 4-(Methylsulfonyl)benzoic acid provides a starting point for the synthesis of a hindered sulphone that can be linked to a solid support or to an affinity capture reagent through the benzoic acid. Amino-derivatised polystyrene is available from various sources including Sigma-Aldrich, UK. Carbodiimide coupling of the functionalised benzoic acid to generate an amide linkage to the solid support would be sufficient to generate a solid support derivatised with the appropriate alkenyl sulphone. Various forms of amino-functionalised biotin are available from Pierce Chemical Company, IL, USA, which would allow a biotin compound derivatised with a variety of alkenyl sulphones to be synthesised.

Synthetic routes for the production of phenyl-1-propenyl, pyridine-1-propenyl, phenyl-1-isobutenyl and pyridine-1-isobutenyl sulphones are described in the examples towards the end of this document. A synthetic route for the production of 1,1,1-trifluoro-3-phenylsulphonylpropene is disclosed by Tsuge H. et al. in J. Chem. Soc. Perkin Trans. 1: 2761-2766, 1995. This reagent is also available from Aldrich (Sigma-Aldrich, Dorset, UK).

A second preferred class of reagents for use in this invention are maleimide compounds. Combinations of these reagents under appropriate mild conditions can allow a high degree of discrimination between alpha-amino groups and lysine epsilon-amino groups in amine-labelling reactions. Maleimide compounds are known to react readily with primary amines giving a mono-alkylated product (see for example: Sharpless N. E. & Flavin M., Biochemistry 5(9): 2963-2971, “The reactions of amines and amino acids with maleimides. Structure of the reaction products deduced from infrared and nuclear magnetic resonance spectroscopy.” 1966; Papini A. & Rudolph S. & Siglmuller G. & Musiol H. J. & Gohring W. & Moroder L., Int J Pept Protein Res 39(4): 348-355, “Alkylation of histidine with maleimido-compounds.” 1992; Khan M. N., J Pharm Sci 73(12): 1767-1771, “Kinetics and mechanism of the alkaline hydrolysis of maleimide.” 1984). The inventors have shown that a solid support derivatised with maleimide (maleimidobutyramidopolystyrene, Fluka) will react more rapidly with epsilon-amino groups under basic conditions than with alpha-amino groups. This reagent is not stable in aqueous conditions, however, and reactions of peptides with this support should be carried out in anhydrous aprotic organic solvents. The use of organic solvents is acceptable for highly hydrophobic proteins, such as proteins embedded in cell membranes, and as such maleimidobutyramidopolystyrene is useful for the analysis of this class of proteins.

Some of the less hindered Michael reagents, such as N-ethylmaleimide (NEM) and the propenyl sulphones, will react quite readily with the alpha-amino group of proline. This will not be a problem in most aspects of this invention as proline is not common and most endoproteases do not cleave at proline linkages anyway. The preferred embodiment of this invention relies on cleavage of proteins and polypeptides by Lys-C type enzymes. Most of the known enzymes of this class will not cleave at Lysine-Proline linkages, so the presence of a free proline alpha-amino will not present a problem. Solid-support bound maleimide also discriminates effectively against proline. It is worth noting that maleimide shows only moderate discrimination for epsilon amino groups over alpha amino groups when used as a solution phase reagent, but the discrimination of the immobilised reagent is greatly improved. Other reagents which show only moderate discrimination in the solution phase may show improved discrimination when immobilised on a solid phase support.

Further aspects of this invention provide a method of determining the ‘expression profile’ of a mixture of polypeptides, i.e. a method to identify and preferably also to quantify each polypeptide in the mixture, and also methods of comparing polypeptides in two or more mixtures (e.g. from two or more separate samples). These methods involve determining the expression profile of one or more polypeptides in the mixture according to the first embodiment of the invention. Different labels may be employed for each sample in the mixture. The labels can be resolved, so that each expression profile, or each individual polypeptide being compared, can still be related to a specific sample. Preferred mass labels for use with this invention are disclosed in PCT/GB01/01122, which discloses organic molecule mass markers that are analysed by selected reaction monitoring. This application discloses two component mass markers connected by a collision cleavable group. Sets of tags are synthesised where the sum of the masses of the two components produces markers with the same overall mass. The mass markers may be analysed after cleavage from their analyte or may be detected while attached to the analyte. In this invention the mass markers are detected while attached to the peptide that they are identifying. Selection of the mass of the mass marker with its associated peptide by the first mass analyser of a tandem instrument allows the marked peptides to be abstracted from the background. Collision of the markers in the second stage of the instrument separates the two components of the tag from each other. Only one of these components is detected in the third mass analyser. This allows confirmation that the peak selected in the first analyser is a mass marked peptide. The whole process greatly enhances the signal to noise ratio of the analysis and improves sensitivity. This mass marker design also compresses the mass range over which an array of mass markers is spread. Moreover, it allows the design of markers which are chemically identical and have the same mass but which are still resolvable by mass spectrometry. This is essential for analytical techniques such as Liquid Chromatography Mass Spectrometry (LC-MS) where the effect of different markers on the mobility of different samples of peptides must be minimised so that corresponding peptides from each sample elute together into the mass spectrometer, allowing the ratios of the corresponding peptides to be determined. These markers are thus most preferred for the purposes of this invention because of the use of high selectivity detection and the closely related structures of these markers. Other markers may also be applicable, though.
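The two component marker design described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the scheme of PCT/GB01/01122: the function names and the integer masses are hypothetical. Each tag consists of a detected component and a complementary component whose masses sum to the same total, so all tagged copies of a peptide are selected together in the first analyser and are distinguished only by the component detected after collision.

```python
def make_isobaric_tags(total_mass, detected_masses):
    """Build a set of two-component tags that all have the same overall mass."""
    tags = []
    for detected in detected_masses:
        if not 0 < detected < total_mass:
            raise ValueError("each component must be lighter than the whole tag")
        tags.append({"detected": detected,
                     "complement": total_mass - detected})
    return tags


def assign_sample(tags, observed_fragment, tolerance=0.01):
    """Return the index of the tag whose detected component matches, or None."""
    for index, tag in enumerate(tags):
        if abs(tag["detected"] - observed_fragment) <= tolerance:
            return index
    return None


# Hypothetical tag set: three tags of overall mass 300.
tags = make_isobaric_tags(300, [100, 110, 120])
overall_masses = [t["detected"] + t["complement"] for t in tags]
sample_index = assign_sample(tags, 110)
```

Because every tag in the set has the same overall mass, the tags are not expected to shift corresponding peptides apart in the first analyser; the detected component alone relates each signal back to its sample.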

In various embodiments of this invention there is an optional, but preferred first step, which involves reducing cysteine disulphide bridges and capping of free thiols. The alkenyl sulphone reagents of this invention are reactive with free thiols. To prevent interference in the methods of this invention by free thiols, and to avoid problems associated with disulphide bridges in polypeptides, it is preferred that the disulphide bridges are reduced to free thiols and that the thiol moieties are capped prior to application of the methods of this invention. Since thiols are very much more reactive than the other side-chains in a protein this step can be achieved highly selectively.

Various reducing agents have been used for disulphide bond reduction. The choice of reagent may be determined on the basis of cost, or efficiency of reaction and compatibility with the reagents used for capping the thiols (for a review on these reagents and their use see Jocelyn P. C., Methods Enzymol. 143: 246-256, “Chemical reduction of disulfides.” 1987).

Typical capping reagents include N-ethylmaleimide, iodoacetamide, vinylpyridine, 4-nitrostyrene, methyl vinyl sulphone or ethyl vinyl sulphone (see for example Krull L. H. & Gibbs D. E. & Friedman M., Anal Biochem 40(1): 80-85, “2-Vinylquinoline, a reagent to determine protein sulfhydryl groups spectrophotometrically.” 1971; Masri M. S. & Windle J. J. & Friedman M., Biochem Biophys Res Commun 47(6): 1408-1413, “p-Nitrostyrene: new alkylating agent for sulfhydryl groups in reduced soluble proteins and keratins.” 1972; Friedman M. & Zahnley J. C. & Wagner J. R., Anal Biochem 106(1): 27-34, “Estimation of the disulfide content of trypsin inhibitors as S-beta-(2-pyridylethyl)-L-cysteine.” 1980).

Typical reducing agents include mercaptoethanol, dithiothreitol (DTT), sodium borohydride and phosphines such as tributylphosphine (see Ruegg U. T. & Rudinger J., Methods Enzymol 47: 111-116, “Reductive cleavage of cystine disulfides with tributylphosphine.”, 1977) and tris(carboxyethyl)phosphine (Burns J. A. et al., J Org Chem 56: 2648-2650, “Selective reduction of disulfides by Tris(2-carboxyethyl)phosphine.”, 1991). Mercaptoethanol and DTT may be less preferred for use with thiol reactive capping reagents as these compounds contain thiols themselves. Phosphine based reducing reagents are compatible with vinyl sulphone reagents (Masri M. S. & Friedman M., J. Protein Chem. 7 (1): 49-54, “Protein reactions with methyl and ethyl vinyl sulfones.” 1988). It is worth noting that the reduction and thiol blocking may take place simultaneously with the epsilon-amino labelling step of the second aspect of this invention.

In various embodiments of this invention a sequence specific cleavage reagent is required. Preferred cleavage reagents for use with this invention are enzymatic reagents. Trypsin is a preferred enzyme for the cleavage of polypeptides. This reagent is the enzyme most widely used for conventional peptide mass fingerprinting. Trypsin is preferred for a number of reasons. It is a highly robust enzyme, tolerating moderate amounts of detergents and denaturants, while still retaining the ability to cleave polypeptides. In addition, if cleavage of the polypeptides proceeds to completion, then each digest peptide has a basic residue at each terminus of the peptide, except for the C-terminal peptides and some blocked N-terminal peptides. The presence of basic residues promotes protonation of the peptides. Various enzymes that cut a polypeptide or peptide at the amide bond C-terminal to a lysine residue are commercially available, e.g. Endoproteinase Lys-C from Lysobacter enzymogenes (formerly available from Boehringer Mannheim, now from Roche Biochemicals). These enzymes are generically referred to as Lys-C and are also preferred enzymes for use with this invention. Similarly, enzymes that cut a polypeptide or peptide at the amide bond C-terminal to an arginine residue are commercially available and are generically referred to as Arg-C enzymes. These are also preferred enzymes for use with this invention. Chemical cleavage may also be applicable with this method. A reagent such as cyanogen bromide, which cleaves at methionine residues, would be appropriate. Chemical cleavage may be advantageous because protease inhibitors may be used during the isolation of the sample of polypeptides from its biological source. The use of protease inhibitors will reduce non-specific degradation of the sample by endogenous proteases. Chemical reagents can also be readily inactivated by addition of appropriate quenching reagents.
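The specificities of the cleavage reagents named above can be illustrated with a simple in-silico digest. The following is a minimal sketch, not a validated digestion model: the rule table, the function name and the test sequence are hypothetical, cleavage is assumed to go to completion, and a proline following the cleavage site is assumed to block the enzymatic reagents (a simplification; cyanogen bromide is assumed to be unaffected).

```python
# Residues after which each reagent cleaves (C-terminal side), and whether a
# following proline is assumed to block cleavage (a simplifying assumption).
CLEAVAGE_RULES = {
    "trypsin": ("KR", True),
    "lys-c": ("K", True),
    "arg-c": ("R", True),
    "cnbr": ("M", False),
}


def digest(sequence, reagent):
    """Return the peptides from a complete digest of `sequence`."""
    residues, proline_blocks = CLEAVAGE_RULES[reagent.lower()]
    sequence = sequence.upper()
    peptides = []
    start = 0
    for i in range(len(sequence) - 1):  # never cut after the last residue
        if sequence[i] not in residues:
            continue
        if proline_blocks and sequence[i + 1] == "P":
            continue
        peptides.append(sequence[start:i + 1])
        start = i + 1
    peptides.append(sequence[start:])  # the C-terminal peptide
    return peptides


# Hypothetical test sequence, not taken from any real protein.
example = "MKAPRGKPLKR"
tryptic = digest(example, "trypsin")
lys_c = digest(example, "lys-c")
```

Note that the Lys-C digest leaves the Lys-Pro linkage intact, so the second peptide contains an internal lysine as well as its C-terminal one.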

In certain embodiments of this invention the mass markers comprise an affinity capture ligand. Affinity capture ligands are ligands that have highly specific binding partners. These binding partners allow molecules tagged with the ligand to be selectively captured by the binding partner. Preferably a solid support is derivatised with the binding partner so that affinity ligand tagged molecules can be selectively captured onto the solid phase support. A preferred affinity capture ligand is biotin, which can be introduced into the peptide mass tags of this invention by standard methods known in the art. In particular a lysine residue may be incorporated after amino acid 2 through which an amine-reactive biotin can be linked to the peptide mass tags (see for example Geahlen R. L. et al., Anal Biochem 202(1): 68-67, “A general method for preparation of peptides biotinylated at the carboxy terminus.” 1992; Sawutz D. G. et al., Peptides 12(5): 1019-1012, “Synthesis and molecular characterization of a biotinylated analog of [Lys]bradykinin.” 1991; Natarajan S. et al., Int J Pept Protein Res 40(6): 567-567, “Site-specific biotinylation. A novel approach and its application to endothelin-1 analogs and PTH-analog.”, 1992). Iminobiotin is also applicable. A variety of avidin counter-ligands for biotin are available, which include monomeric and tetrameric avidin and streptavidin, all of which are available on a number of solid supports.

Other affinity capture ligands include digoxigenin, fluorescein, nitrophenyl moieties and a number of peptide epitopes, such as the c-myc epitope, for which selective monoclonal antibodies exist as counter-ligands. Metal ion binding ligands such as hexahistidine, which readily binds Ni²⁺ ions, are also applicable. Chromatographic resins which present iminodiacetic acid chelated Ni²⁺ ions are commercially available, for example. These immobilised nickel columns may be used to capture tagged peptides that comprise oligomeric histidine. As a further alternative, an affinity capture functionality may be selectively reactive with an appropriately derivatised solid phase support. Boronic acid, for example, is known to selectively react with vicinal cis-diols and chemically similar ligands, such as salicylhydroxamic acid. Reagents comprising boronic acid have been developed for protein capture onto solid supports derivatised with salicylhydroxamic acid (Stolowitz M. L. et al., Bioconjug Chem 12(2): 229-239, “Phenylboronic Acid-Salicylhydroxamic Acid Bioconjugates. 1. A Novel Boronic Acid Complex for Protein Immobilization.” 2001; Wiley J. P. et al., Bioconjug Chem 12(2): 240-250, “Phenylboronic Acid-Salicylhydroxamic Acid Bioconjugates. 2. Polyvalent Immobilization of Protein Ligands for Affinity Chromatography.” 2001, Prolinx, Inc, Washington State, USA). It is anticipated that it should be relatively simple to link a phenylboronic acid functionality to the tags of this invention to generate capture reagents that can be captured by selective chemical reactions. The use of this sort of chemistry would not be directly compatible with biomolecules bearing vicinal cis-diol-containing sugars; however, these sorts of sugars could be blocked with phenylboronic acid or related reagents prior to reaction with boronic acid derivatised tag reagents.

The methods of this invention can be used to profile populations of proteins generated in numerous ways. It may be possible to analyse raw protein extracts from organisms such as yeast directly using the methods of this invention. Organisms with larger proteomes may require fractionation of the raw protein extracts from their tissues. Various fractionation techniques exist to sub-sort proteins on the basis of certain features. A population of proteins extracted from a mammalian tissue, for example, is going to contain a significant number of distinct protein species. It is thought there are of the order of 10,000 transcripts, which may comprise alternatively spliced products from numerous genes, expressed in the average human cell (Iyer V. R. et al., Science 283 (5398) 83-87, “The transcriptional program in the response of human fibroblasts to serum.” 1999), and experiments with 2-D gels have shown that similar numbers of protein spots are found in gels of proteins extracted from a particular tissue (Klose J., Kobalz U., Electrophoresis 16 (6) 1034-59, “Two-dimensional electrophoresis of proteins: an updated protocol and implications for a functional analysis of the genome.” 1995). It may be desirable to fractionate complex samples of proteins, such as those that would be isolated from human tissue, prior to application of the methods of this invention to simplify analysis or to provide additional information, such as identifying proteins with post-translational modifications.

Fractionation steps can be used to reduce the complexity of a population of proteins by resolving a protein population into a number of discrete subsets, preferably subsets of uniform size. This is most readily achieved by separation on the basis of global properties of proteins that vary over a broad and continuous range, such as size and surface charge. These are the properties used most effectively in 2-D gel electrophoresis. Such separations can be achieved more rapidly than gel electrophoresis using liquid chromatographic techniques. By following one liquid chromatography separation by another, a population of proteins can be resolved to an arbitrary degree, although a large number of sequential chromatographic separation steps could result in sample loss or other artefacts due to non-specific adhesion of proteins or peptides to different chromatographic matrices.

Cell Fractionation

Proteins are compartmentalised within their cells. Various techniques are known in the art to fractionate proteins on the basis of their cellular compartments. Fractionation protocols involve various cell lysis techniques such as sonication, detergents or mechanical cell lysis that can be followed by a variety of fractionation techniques, such as centrifugation. Separation into membrane proteins, cytosolic proteins and the major membrane bound subcellular compartments, such as the nucleus and mitochondria, is standard practice. Thus certain classes of protein may be effectively ignored or can be specifically analysed. This form of fractionation may be extremely informative if a particular protein is found in a number of subcellular locations since its location is likely to reveal information about its function.

Fractionation of Proteins

Since proteins are highly heterogeneous molecules, numerous techniques for separation of proteins are available. It is possible to separate proteins on the basis of size, hydrophobicity, surface charge and/or by affinity to particular ligands. Separation is effected by an assortment of solid phase matrices derivatised with various functionalities that adhere to and hence slow down the flow of proteins through the column on the basis of specific properties. Matrices derivatised with hydrophobic moieties can be used to separate proteins based on their hydrophobicity, while charged resins can be used to separate proteins on the basis of their charge. In a typical chromatographic separation, analyte molecules are injected into columns packed with a derivatised resin in a loading buffer or solvent that favours adhesion to the solid phase matrix. This is followed by washing the column with steadily increasing quantities of a second buffer or solvent favouring elution. In this way the proteins with the weakest interactions with a given matrix elute first.

Fractionation by Affinity

A population of proteins can be fractionated by affinity methods. This sort of fractionation method relies on specific interactions between proteins, or classes of proteins, with specific ligands.

Many proteins, for example, exist as complexes with other proteins and analysis of such complexes is often difficult. A cloned protein that is a putative member of a complex can be used to generate an affinity column with the cloned protein acting as an affinity ligand to capture other proteins that normally bind to it. This invention is eminently suited to the analysis of such captured protein complexes.

Isolation of Post-translationally Modified Proteins

A large number of affinity ligands are available commercially for specific applications such as the isolation of proteins with post-translational modifications. A number of tagging procedures are also known by which affinity tags such as biotin can be introduced into proteins that have specific post-translational modifications, allowing such proteins to be captured using biotin-avidin affinity chromatography.

Isolation of Carbohydrate Modified Proteins

Carbohydrates are often present as a post-translational modification of proteins. Various affinity chromatography techniques for the isolation of these sorts of proteins are known (for a review see Gerard C., Methods Enzymol. 182, 529-539, “Purification of glycoproteins.” 1990). A variety of natural protein receptors for carbohydrates are known. The members of this class of receptors, known as lectins, are highly selective for particular carbohydrate functionalities. Affinity columns derivatised with specific lectins can be used to isolate proteins with particular carbohydrate modifications, whilst affinity columns comprising a variety of different lectins could be used to isolate populations of proteins with a variety of different carbohydrate modifications. Many carbohydrates have vicinal-diol groups present, i.e. hydroxyl groups present on adjacent carbon atoms. Carbohydrates that contain vicinal diols in a 1,2-cis-diol configuration will react with boronic acid derivatives to form cyclic esters. This reaction is favoured at basic pH but is easily reversed at acid pH. Resin immobilised derivatives of phenyl boronic acid have been used as ligands for affinity capture of proteins with cis-diol containing carbohydrates. Vicinal-diols, in sialic acids for example, can also be converted into carbonyl groups by oxidative cleavage with periodate. Enzymatic oxidation of sugars containing terminal galactose or galactosamine with galactose oxidase can also convert hydroxyl groups in these sugars to carbonyl groups. Complex carbohydrates can also be treated with carbohydrate cleavage enzymes, such as neuraminidase, which selectively remove specific sugar modifications leaving behind sugars, which can be oxidised. These carbonyl groups can be tagged, allowing proteins bearing such modifications to be detected or isolated.
Biocytin hydrazide (Pierce & Warriner Ltd, Chester, UK) will react with carbonyl groups in carbonyl-containing carbohydrate species (E. A. Bayer et al. Anal. Biochem. 170, 271-281, “Biocytin hydrazide--a selective label for sialic acids, galactose, and other sugars in glycoconjugates using avidin-biotin technology”, 1988). Alternatively a carbonyl group can be tagged with an amine modified biotin, such as Biocytin and EZ-Link® PEO-Biotin (Pierce & Warriner Ltd, Chester, UK), using reductive alkylation (Means G. E., Methods Enzymol 47, 469-478, “Reductive alkylation of amino groups.” 1977; Rayment I., Methods Enzymol 276: 171-179, “Reductive alkylation of lysine residues to alter crystallization properties of proteins.” 1997). Proteins bearing vicinal-diol containing carbohydrate modifications in a complex mixture can thus be biotinylated. Biotinylated, hence carbohydrate modified, proteins may then be isolated using an avidinated solid support.

Peptides may then be isolated from the captured carbohydrate bearing proteins obtained using the above methods, and analysed.

Isolation of Phosphorylated Proteins

Phosphorylation is a ubiquitous reversible post-translational modification that appears in the majority of signalling pathways of almost all organisms. It is an important area of research, and tools which allow the analysis of the dynamics of phosphorylation are essential to a full understanding of how cells respond to stimuli, which includes the responses of cells to drugs.

A number of research groups have reported on the production of antibodies which bind to phosphotyrosine residues in a wide variety of proteins (see for example A. R. Frackelton et al., Methods Enzymol. 201, 79-92, “Generation of monoclonal antibodies against phosphotyrosine and their use for affinity purification of phosphotyrosine-containing proteins.”, 1991 and other articles in this issue of Methods Enzymol.). This means that a significant proportion of proteins that have been post-translationally modified by tyrosine phosphorylation may be isolated by affinity chromatography using these antibodies as the affinity column ligand.

These phosphotyrosine binding antibodies can be used in the context of this invention to isolate peptides from proteins containing phosphotyrosine residues. The tyrosine-phosphorylated proteins in a complex mixture may be isolated using anti-phosphotyrosine antibody affinity columns. The peptides from the fractionated mixture of phosphoproteins may then be isolated and analysed according to the methods of this invention.

Techniques for the analysis of phosphoserine and phosphothreonine containing peptides are also known. One class of such methods is based on a well known reaction for beta-elimination of phosphates. This reaction results in phosphoserine and phosphothreonine forming dehydroalanine and methyldehydroalanine, both of which are Michael acceptors and will react with thiols. This has been used to introduce hydrophobic groups for affinity chromatography (see for example Holmes C. F., FEBS Lett 215 (1) 21-24, “A new method for the selective isolation of phosphoserine-containing peptides.” 1987). Dithiol linkers have also been used to introduce fluorescein and biotin into phosphoserine and phosphothreonine containing peptides (Fadden P, Haystead T A, Anal Biochem 225 (1) 81-8, “Quantitative and selective fluorophore labelling of phosphoserine on peptides and proteins: characterization at the attomole level by capillary electrophoresis and laser-induced fluorescence.” 1995; Yoshida O. et al., Nature Biotech 19, 379-382, “Enrichment analysis of phosphorylated proteins as a tool for probing the phosphoproteome”, 2001). The use of biotin for affinity enrichment of proteins phosphorylated at serine and threonine could be used with the methods of this invention so that only the terminal peptides need to be analysed. Similarly anti-fluorescein antibodies are known which would allow fluorescein tagged peptides to be selectively isolated with affinity chromatography. This could be followed by peptide isolation and analysis according to the methods of this invention.

A chemical procedure for the isolation of phosphoproteins onto solid phase supports has also been published (Zhou H et al., Nature Biotech 19, 375-378, “A systematic approach to the analysis of protein phosphorylation”, 2001). This procedure relies on the fact that phosphoramidates hydrolyse easily under acid conditions. The procedure involves capping all free amines in a mixture of proteins, followed by blocking all free phosphates and carboxyl groups by coupling the phosphates and carboxyls with a capping group containing an amine functionality to form the corresponding phosphoramidates and amides. The blocked proteins are then treated with acid to unblock the phosphates. The peptides are then reacted with a second amine reagent carrying a protected thiol. This step blocks the phosphates again. The protected thiol is then deprotected and used to capture the phosphopeptides selectively onto a thiol reactive resin. These peptides can then be released by acid hydrolysis, after thorough washing of the resin. This procedure is claimed to be applicable to all phosphate groups, but phosphotyrosine is acid labile and so the method is unlikely to be applicable to phosphotyrosine.

Immobilised Metal Affinity Chromatography (IMAC) represents a further technique for the isolation of phosphoproteins and phosphopeptides. Phosphates adhere to resins comprising trivalent metal ions, particularly Gallium(III) ions (Posewitch, M. C. and Tempst, P., Anal. Chem., 71: 2883-2892, “Immobilized Gallium (III) Affinity Chromatography of Phosphopeptides”, 1999). This technique is advantageous as it can isolate both serine/threonine phosphorylated and tyrosine phosphorylated peptides and proteins simultaneously.

IMAC can therefore also be used in the context of this invention for the analysis of samples of phosphorylated proteins. In an alternative embodiment of the second aspect of this invention, a sample of phosphorylated proteins may be analysed by isolating phosphorylated proteins followed by analysis of the peptides of the phosphoproteins. A protocol for the analysis of a sample of proteins, which contains phosphorylated proteins, would comprise the steps of:

1. passing the protein sample through an affinity column comprising immobilised metal ions to isolate only phosphorylated proteins,
2. isolating and analysing the peptides from the captured phosphorylated proteins using the methods of this invention.

Other Post-translational Modifications of Proteins

Proteins that have been modified by ubiquitination, lipoylation and other post-translational modifications may also be isolated or enriched by chromatographic techniques (Gibson J. C., Rubinstein A., Ginsberg H. N. & Brown W. V., Methods Enzymol 129, 186-198, “Isolation of apolipoprotein E-containing lipoproteins by immunoaffinity chromatography.” 1986; Tadey T. & Purdy W. C. J Chromatogr. B Biomed. Appl. 671 (1-2), 237-253, “Chromatographic techniques for the isolation and purification of lipoproteins.” 1995) or affinity ligand based techniques such as immunoprecipitation (Hershko A., Eytan E., Ciechanover A. & Haas A. L., J. Biol. Chem. 257, (23) 13964-13970, “Immunochemical analysis of the turnover of ubiquitin-protein conjugates in intact cells. Relationship to the breakdown of abnormal proteins.” 1982). Populations of proteins with these modifications can all be analysed by the methods of this invention.

In preferred embodiments of this invention the lysine-selective tags comprise sensitivity enhancing groups. Various functionalities can be used as sensitivity enhancing groups. The choice of functionality is largely determined by the mass spectrometric analysis technique to be used. The guanidino group and the tertiary amino group are both useful Sensitivity Enhancing Groups for electrospray mass spectrometry (Francesco L. Brancia, Stephen G. Oliver and Simon J. Gaskell, Rapid Commun. in Mass Spec., 14, 2070-2073, “Improved matrix-assisted laser desorption/ionization mass spectrometric analysis of tryptic hydrolysates of proteins following guanidination of lysine-containing peptides.” 2000).

Various other methods for derivatising peptides have also been developed. These include the use of quaternary ammonium derivatives, quaternary phosphonium derivatives and pyridyl derivatives for positive ion mass spectrometry. Halogenated compounds, particularly halogenated aromatic compounds, are well known electrophores, i.e. they pick up thermal electrons very easily. A variety of derivatisation reagents based on fluorinated aromatic compounds (Bian N. et al., Rapid Commun Mass Spectrom 11(16): 1781-1784, “Detection via laser desorption and mass spectrometry of multiplex electrophore-labelled albumin.” 1997) have been developed for electron capture detection, which is a highly sensitive ionisation and detection process that can be used with negative ion mass spectrometry (Abdel-Baky S. & Giese R. W., Anal. Chem. 63(24):2986-2989, “Gas chromatography/electron capture negative-ion mass spectrometry at the zeptomole level.” 1991). A fluorinated aromatic group could also be used as a sensitivity enhancing group.

Aromatic sulphonic acids have also been used for improving sensitivity in negative ion mass spectrometry.

Each type of sensitivity enhancing group has different benefits, which depend on the method of ionisation used and on the methods of mass analysis used. The mechanism by which sensitivity is enhanced may also be different for each type of group. Some derivatisation methods increase basicity and thus promote protonation and charge localisation, while other methods increase surface activity of the tagged peptides, which improves sensitivity in surface desorption techniques like Matrix Assisted Laser Desorption Ionisation (MALDI) and Fast Atom Bombardment (FAB). Negative ion mass spectrometry is often more sensitive because there is less background noise. Charge derivatisation can also change the fragmentation products of derivatised peptides when collision induced dissociation is used. In particular some derivatisation techniques simplify fragmentation patterns, which is highly advantageous if peptides are to be analysed by techniques such as collision induced dissociation. The choice of Sensitivity Enhancing Group is determined by the mass spectrometric techniques that will be employed (for a review see Roth et al., Mass Spectrometry Reviews 17:255-274, “Charge derivatization of peptides for analysis by mass spectrometry”, 1998). For the purposes of this invention all of the known sensitivity enhancing groups could be used with the lysine-selective tags of this invention.

In preferred embodiments of this invention, the lysine-selective alkenyl sulphone reagents comprise a non-fluorescent dye. Preferably, the tags comprise a dye that has a high extinction coefficient for a particular frequency of light and which dissipates absorbed energy through vibrational modes. Some examples of such dyes are used as matrices for MALDI-TOF mass spectrometry, where excitation of the dyes by laser light leads to rapid sublimation of the dyes. This sublimation process also vaporises any co-crystallised material. Cinnamic acid derivatives are preferred dyes that are widely used in MALDI-TOF (Beavis R C, Chait B T, Rapid Commun Mass Spectrom 3(12):432-435, “Cinnamic acid derivatives as matrices for ultraviolet laser desorption mass spectrometry of proteins.” 1989). The inventors have found, in a co-pending application to be filed, that covalently linking derivatives of cinnamic acid, and other dyes, to peptides greatly increases the yield of ions from the attached peptides. Therefore, alkenyl sulphone reagents that comprise cinnamic acid derivatives are preferred tags for use with this invention.

In some embodiments of this invention, the alpha-amino groups of peptides from digested polypeptides may be tagged with sensitivity enhancing groups. N-hydroxysuccinimide esters of cinnamic acid derivatives, such as 4-hydroxy-alpha-cyano-cinnamic acid, may be excellent tags for this purpose.

Determination of Peptide Mass Fingerprints

Some of the less hindered Michael reagents, such as N-ethylmaleimide (NEM) and the propenyl sulphones, will react quite readily with proline. This will not be a problem in most aspects of this invention as proline is not common and most endoproteases do not cleave at proline linkages anyway. Some aspects of this invention rely on cleavage of proteins and polypeptides by Lys-C type enzymes. Most of the known enzymes of this class will not cleave at lysine-proline linkages, so the presence of a free proline alpha-amino group is unlikely unless it occurs at the N-terminus of a protein. Similarly trypsin will not cleave at lysine-proline or arginine-proline linkages and is usable in the first and second aspects of this invention to avoid the production of free proline alpha-amino groups. An N-terminal proline will only be a problem for this invention where the proline is unblocked. Improved discrimination between proline and lysine is, however, found in the more hindered alkenyl sulphones such as the isobutenyl sulphones, the trifluoropropenyl sulphones and the hexafluoroisobutenyl sulphones, so these reagents should be used if discrimination against proline is required.

In one embodiment of this invention, which describes a general method to produce peptide mass fingerprints of lysine labelled polypeptides, the discrimination of the hindered sulphones is used to specifically label epsilon-amino groups. This reaction follows cleavage of the polypeptide or mixture of polypeptides with a sequence specific cleavage reagent, which can be enzymatic such as trypsin or can be chemical such as cyanogen bromide. The cleavage of the mixture of polypeptides with the sequence specific cleavage reagent will expose alpha-amino groups in all the resulting cleavage peptides. The lysine-selective tags of this invention are reacted with the digested peptides and the tags will selectively react with epsilon-amino groups in preference to any alpha-amino groups that are available. These labelled peptides are then analysed by mass spectrometry to determine a peptide mass fingerprint from the labelled peptides.
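By way of illustration only, the effect of a lysine-selective tag on the masses in a peptide mass fingerprint can be sketched in Python. The residue masses are standard monoisotopic values; the tag mass of 100.0 Da and the function names are assumptions made for this sketch and do not correspond to any particular reagent of this invention.

```python
# Standard monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added for the free N- and C-termini


def peptide_mass(sequence, tag_mass=0.0):
    """Neutral monoisotopic mass of a peptide.

    tag_mass is the (hypothetical) net mass added per lysine
    epsilon-amino group by the lysine-selective tag.
    """
    base = sum(RESIDUE_MASS[residue] for residue in sequence) + WATER
    return base + tag_mass * sequence.count("K")


def labelled_fingerprint(peptides, tag_mass=0.0):
    """Sorted list of peptide masses, i.e. a simple mass fingerprint."""
    return sorted(round(peptide_mass(p, tag_mass), 4) for p in peptides)


if __name__ == "__main__":
    # Illustrative peptides; only the lysine-containing one shifts in mass.
    print(labelled_fingerprint(["GK", "GR"], tag_mass=100.0))
```

Under these assumptions only lysine-containing peptides are shifted by the tag mass, which is what allows labelled and unlabelled peptides to be distinguished in the fingerprint.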

In certain forms of this embodiment of the invention, the lysine-selective tags may comprise an affinity tag. This allows peptides that have been labelled on their lysine residues to be selectively isolated from peptides that do not contain lysine and which may contaminate the mass spectrum generated from the labelled digest. In one preferred embodiment, the polypeptide or polypeptides to be analysed are digested with trypsin as shown in FIG. 2. Trypsin cleaves at both arginine and lysine, generating peptides that terminate at either arginine or lysine. If the digestion is allowed to proceed to completion, then cleavage will have taken place at substantially all of the available lysine and arginine residues and each of the digest peptides will contain only one lysine or arginine residue except for C-terminal peptides, which will contain neither lysine nor arginine. This means that labelling the digest peptides with a lysine-selective tag will introduce one and only one tag into those peptides that contain lysine. If, as shown in FIG. 4, the lysine-selective tag comprises an affinity tag, like biotin, then the lysine containing peptides that are labelled can be isolated by affinity chromatography, using an avidinated solid support if the affinity tag is biotin. This results in a reduced subset of the digest peptides that can then be analysed by MALDI-TOF mass spectrometry to determine a peptide mass fingerprint for the chosen polypeptide or polypeptides. The use of an affinity tag is highly advantageous as isolation of the peptides allows the lysine containing peptides to be separated from arginine containing peptides. This reduces the potential for competition during ionisation, which favours ionisation of arginine containing peptides. Furthermore, the labelled peptides can be isolated from lysine-containing peptides that have not reacted with the tags and from terminal peptides that do not contain lysine or arginine.
In addition, capturing the peptides onto a solid support allows the isolated peptides to be conditioned for mass spectrometry. This means that any detergents, denaturants and polymeric buffering agents that may have been used during the isolation of the polypeptide, during digestion of the polypeptide or during labelling of the polypeptide can be washed away. Non-volatile buffer components such as metal ions from the peptide buffers can also be exchanged for ammonium ions by washing the peptides on the support with appropriate ammonium ion containing buffers to ensure that metal ion adducts of the peptides do not contaminate the mass spectrum of the peptide digest. If biotin and avidin are used as the affinity tag and counter-ligand respectively, then the sequence specific cleavage reagent used to digest the polypeptide or polypeptides must be inactivated before the labelled peptides are isolated onto an avidinated support. If the cleavage agent is still active it will digest the avidinated support, releasing the captured peptides. The labelling of lysine functionalities may inactivate the cleavage reagent if it is enzymatic, but it is also preferable to add an inhibitor of the enzyme after digestion is complete. The lysine-selective tag comprising the affinity tag may additionally comprise a sensitivity enhancing group to improve the peptide mass fingerprint. Alternatively, the captured peptides, which in most cases will have exposed alpha-amino groups, can be reacted with an amino-reactive reagent thereby linking a sensitivity enhancing group to the peptide. Peptides derived from the N-terminus of a polypeptide are sometimes blocked and so these blocked peptides would not be labelled if an alpha-amino labelling strategy is used.
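The tryptic digestion and affinity selection described above can be sketched in Python as follows. This is a minimal in silico model that assumes complete digestion, complete labelling and complete capture; the function names and the example sequence are hypothetical and are used for illustration only.

```python
def trypsin_digest(sequence):
    """Cleave after K or R, except where the next residue is proline."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal peptide, which normally carries neither K nor R.
        peptides.append(sequence[start:])
    return peptides


def affinity_capture(peptides):
    """Model capture of tagged peptides: only lysine-containing
    peptides carry the (e.g. biotin) affinity tag and are retained."""
    return [p for p in peptides if "K" in p]


if __name__ == "__main__":
    digest = trypsin_digest("MAGKSTRLLEKAAW")  # hypothetical sequence
    print(digest)                    # all tryptic peptides
    print(affinity_capture(digest))  # reduced, lysine-containing subset
```

In this model the arginine-terminated peptide and the C-terminal peptide are discarded, leaving the reduced subset of singly tagged lysine peptides for mass spectrometry.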

In other embodiments of this invention, lysine selective tags that comprise affinity tags can be used in conjunction with sequence specific cleavage agents other than trypsin. FIG. 3 shows the use of Lys-C. Lys-C, however, is sometimes less preferred than trypsin as the peptides that result from the digestion of a polypeptide with Lys-C may contain one or more arginine residues, which may compete for ionisation with peptides that do not contain arginine and may also promote the formation of ions that have been multiply protonated. However, Lys-C is advantageous as it is possible to ensure that each peptide receives only one tag, assuming the digestion goes to completion and bearing in mind that most C-terminal peptides will not have a lysine after cleavage of a polypeptide with Lys-C. Trypsin will generate peptides that have either a C-terminal lysine or a C-terminal arginine. Arginine containing peptides tend to be detected more readily by MALDI-TOF mass spectrometry, because of the very basic guanidino-functionality of arginine (Krause E., Wenschuh H. & Jungblut P. R., Anal Chem. 71(19): 4160-4165, “The dominance of arginine-containing peptides in MALDI-derived tryptic mass fingerprints of proteins.” 1999). Lysine selective tags can be used to selectively label peptides that do not contain arginine. The tags can be used to introduce a guanidino-functionality, which can help to facilitate the detection of lysine containing peptides (Brancia et al., Electrophoresis 22: 552-559, “A combination of chemical derivitisation and improved bioinformatics tools optimises protein identification for proteomics”, 2001).

Expression Profiling and Peptide Mass Fingerprints

This embodiment of the invention provides methods of comparing the expression levels of polypeptides in different samples using peptide mass fingerprinting. To compare the expression profile of two samples it is necessary to determine the identity and relative quantities of each of the component polypeptides in the two samples. This embodiment provides methods to determine both the identity and the relative quantities of each of the component polypeptides in two or more different samples. To achieve this the polypeptides in each sample are labelled with labels that can be resolved by mass spectrometry. The labelled polypeptides are then pooled. The components of the pooled samples are resolved from each other by separating the components using electrophoretic or chromatographic procedures. The separated proteins can then be identified by peptide mass fingerprinting. The use of the labelling procedures described in this invention also allows the relative levels of each component polypeptide to be determined during the mass spectrometric identification of the polypeptides.

Direct quantification of analytes by mass spectrometry is highly unreliable, and accurate quantification by mass spectrometry is generally achieved by comparing an analyte with a ‘standard’ which usually comprises a known quantity of the same material that has an isotopically different mass from the analyte. The standard is usually spiked into the sample just before analysis. The ratio of the analyte to the standard can be used to calculate the quantity of the material. In some situations, the exact quantity is not necessary and the ratio of two isotopes of a substance is sufficient. This is true for protein expression profiling. It is sufficient to be able to determine the ratio of the same polypeptide in different samples to understand how the samples differ from each other. To achieve this the two polypeptides must be isotopically differentiated prior to analysis by mass spectrometry. This can be achieved by labelling the polypeptides in each sample with different isotopes of a tag compound. In the context of this invention, the labelling of polypeptides in different samples takes place prior to the separation of the polypeptides in the samples. This means that the labels must not change the chromatographic behaviour of the labelled proteins. Preferred labels with the necessary properties for use with this invention are disclosed in PCT/GB01/01122, which discloses organic molecule mass markers that are analysed by selected reaction monitoring in a mass spectrometer capable of serial mass analyses, such as an ion trap or triple quadrupole mass spectrometer. PCT/GB01/01122 discloses mass markers which have two components connected by a collision cleavable group. Sets of tags are synthesised where the sum of the masses of the two components produces markers with the same overall mass.
If each of the components of the mass marker is a different isotopic form, then it is possible to create mass markers that are chemically identical and which have the same mass but have a different mass distribution on either side of the collision cleavable bond. The mass markers may be analysed after cleavage from their analyte or may be detected while attached to the analyte. In the context of the present invention the mass markers disclosed in PCT/GB01/01122 are detected while attached to the peptide that they are identifying. Selection of the mass of the mass marker with its associated peptide by the first mass analyser of a tandem instrument allows the marked peptides to be abstracted from the background. If two identical peptides with different tags are present, i.e. peptides from different samples, they will have the same mass in the first stage of analysis and will be selected together. Collision of the marked peptides in the second stage of the instrument separates the two components of the tags from each other. Only one of these components for each tag is detected in the third mass analyser. The ratio of the intensities of the tag fragments from each peptide is a direct measure of the ratio of the peptides in the original sample material. The identification of the tag fragments also provides confirmation that the peak selected in the first analyser is a mass marked peptide. The whole process greatly enhances the signal to noise ratio of the analysis and improves sensitivity. This mass marker design also compresses the mass range over which an array of mass markers is spread. Moreover, it allows the design of markers which are chemically identical and have the same mass but which are still resolvable by mass spectrometry.
This is essential for analytical techniques such as 2-D gel electrophoresis or Liquid Chromatography Mass Spectrometry (LC-MS) where the effect of different markers on the mobility of different samples of peptides must be minimised so that corresponding peptides and polypeptides from each sample move together during fractionation procedures. This is essential to allow the ratios of the corresponding peptides to be determined. These markers are thus most preferred for the purposes of this invention because of the use of high selectivity detection and the closely related structures of these markers. The label compounds disclosed in PCT/GB01/01122 can be modified so that they will react with polypeptides using the preferred alkenyl sulphone reactive functionalities provided by this invention. Other markers may also be applicable, though.
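The principle of markers that share the same overall mass but yield distinguishable fragments, and the use of fragment intensities to obtain a ratio, can be sketched in Python. The nominal component masses, sample names and intensity values below are invented for illustration and are not taken from PCT/GB01/01122.

```python
# Hypothetical nominal masses (Da) of the two tag components on either
# side of the collision cleavable bond. Values are illustrative only.
TAGS = {
    "sample_1": (100, 104),
    "sample_2": (104, 100),
}


def is_isobaric(tags):
    """True if every tag in the set has the same overall mass."""
    totals = {first + second for first, second in tags.values()}
    return len(totals) == 1


def abundance_ratio(intensities, numerator, denominator):
    """Ratio of tag fragment intensities for one peptide, taken as an
    estimate of the ratio of that peptide between two samples."""
    if intensities[denominator] <= 0:
        raise ValueError("denominator intensity must be positive")
    return intensities[numerator] / intensities[denominator]


if __name__ == "__main__":
    # Invented fragment intensities for a single selected peptide.
    observed = {"sample_1": 3000.0, "sample_2": 1500.0}
    print(is_isobaric(TAGS))
    print(abundance_ratio(observed, "sample_1", "sample_2"))
```

Because the tags in such a set have identical overall mass, tagged peptides from both samples are selected together in the first analyser, and only the fragment intensities after collision distinguish the samples.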

A set or array of labels, of the form disclosed in PCT/GB01/01122, can be used with the methods of the present invention to increase the throughput of a 2-D gel electrophoresis analysis of the polypeptides in a biological sample. Each of the mass labels alters the mobility of its associated polypeptide in the same way but is still independently detectable. If the tags used comprise a group that can be immobilised on a solid support, such as biotin, then the proteins can be immobilised on an avidinated resin to allow conditioning for mass spectrometric analysis.

In a preferred embodiment of the invention, a method is provided for the analysis of a series of polypeptide containing samples, each sample containing more than one polypeptide, the method comprising the steps of:

1. Covalently reacting the polypeptides of each of the samples with at least one discretely resolvable mass label, such that the polypeptides of each sample are labelled with one or more mass labels that are different from the labels reacted with the proteins of every other sample.
2. Pooling the mass labelled samples.
3. Optionally separating the pooled samples by gel electrophoresis, iso-electric focusing, liquid chromatography or other appropriate means to generate discrete fractions. These fractions may be bands or spots on a gel or liquid fractions from a chromatographic separation. Fractions from one separation may be separated further using a second separation technique. Similarly further fractions may be fractionated again until the proteins are sufficiently resolved for the subsequent analysis steps.
4. Digesting the polypeptide or polypeptides in each fraction with a sequence specific cleavage reagent.
5. Analysing the digests by mass spectrometry, to identify the polypeptides in the fraction and to detect the labels attached to the proteins.

FIG. 6 shows a suitable labelling procedure for use in another embodiment of this invention. In this figure the procedure is shown for a single polypeptide and the separation steps are omitted. In the first step of FIG. 6, the polypeptide is treated with a reducing agent to break disulphide bridges in the molecule, followed by capping of the free thiols that result. Free epsilon amino groups are then capped with a hindered alkenyl sulphone tag. If different samples are to be compared then a different tag would be used for each sample. At this stage, labelled samples would be pooled and any fractionation procedures that are necessary would be performed. The final step of FIG. 6 shows digestion of the labelled polypeptide with trypsin, which can now only cleave at arginine residues. This step would take place after fractionation of the labelled polypeptides if a complex mixture is to be analysed.
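The consequence of capping lysine before digestion can be sketched in Python by comparing the cleavage of an uncapped and a capped polypeptide. This is a simplified model assuming complete capping and complete digestion; the function names and example sequence are hypothetical.

```python
def digest(sequence, cleave_after):
    """Cleave C-terminal to any residue in cleave_after, except where
    the next residue is proline."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in cleave_after and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def tags_per_peptide(peptides):
    """Number of lysine-linked tags carried by each peptide."""
    return {peptide: peptide.count("K") for peptide in peptides}


if __name__ == "__main__":
    protein = "MAGKSTRLLEKAAW"           # hypothetical sequence
    uncapped = digest(protein, "KR")     # trypsin on unmodified protein
    capped = digest(protein, "R")        # trypsin after lysine capping
    print(uncapped)
    print(capped)
    print(tags_per_peptide(capped))
```

In this model the capped polypeptide yields fewer, longer peptides, each of which may carry one or more tags depending on how many lysine residues lie between adjacent arginine cleavage sites.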

A further preferred embodiment of the present invention provides amethod of identifying a protein in a sample containing more than oneprotein, the method comprising the steps of:

1. Covalently reacting the proteins of the sample with at least one discretely resolvable mass label from the sets and arrays of this invention.
2. Optionally separating the proteins by gel electrophoresis, iso-electric focusing, liquid chromatography or other appropriate means to generate discrete fractions. These fractions may be bands or spots on a gel or liquid fractions from a chromatographic separation. Fractions from one separation may be separated further using a second separation technique. Similarly, further fractions may be fractionated again until the proteins are sufficiently resolved for the subsequent analysis steps.
3. Digesting the proteins in the fraction with a sequence specific cleavage reagent.
4. Optionally reacting the proteins in the sample with an additional mass label.
5. Analysing the digested fractions by liquid chromatography mass spectrometry where the elution time of mass marked peptides from the liquid chromatography column step is determined by detecting the mass labels attached to the peptides. A mass spectrometry analysis is performed, preferably according to an aspect of this invention, to detect the labels attached to the proteins.
6. Analysing the digests by mass spectrometry, to identify the polypeptides in the fraction and to detect the labels attached to the proteins.

In the above preferred embodiments of this aspect of the invention, the step of fractionating the proteins is preferably effected by performing 2-dimensional gel electrophoresis, using iso-electric focusing in the first dimension and SDS PAGE in the second dimension. Typically, the gel is visualised to identify where proteins have migrated to on the gel. Visualisation of the gel is typically performed by staining the gel to reveal protein spots. Various staining procedures and reagents have been developed, although many stains are not compatible with mass spectrometry or require extensive de-staining procedures prior to mass spectrometry. Silver staining is generally regarded as one of the most sensitive staining procedures, although it requires de-staining prior to mass spectrometry (Gharahdaghi F et al., Electrophoresis 20(3):601-605, “Mass spectrometric identification of proteins from silver-stained polyacrylamide gel: a method for the removal of silver ions to enhance sensitivity.” 1999). Novel fluorescent stains that are compatible with mass spectrometry have also been developed (Lopez M F et al., Electrophoresis 21(17):3673-3683, “A comparison of silver stain and SYPRO Ruby Protein Gel Stain with respect to protein detection in two-dimensional gels and identification by peptide mass profiling.” 2000). For the purposes of this invention, any of the conventional staining procedures that are compatible with mass spectrometry may be used with the methods of this invention. The proteins in each spot are then identified. There are two approaches to this. In the first approach, the proteins are extracted from the gel. Robotic instrumentation can be used to excise the protein containing spots from the gel. The proteins are then extracted from the excised gel spot. These extracted proteins are then digested and the digest peptides from the polypeptides are analysed by mass spectrometry to determine a peptide mass fingerprint, usually by MALDI TOF mass spectrometry, although electrospray mass spectrometry is also quite widely used. Proteins can also be extracted by electroblotting onto a polyvinylidene difluoride membrane, after which enzymatic digestion of the proteins can take place on the membrane (Vestling M M, Fenselau C, Biochem Soc Trans 22(2):547-551, “Polyvinylidene difluoride (PVDF): an interface for gel electrophoresis and matrix-assisted laser desorption/ionisation mass spectrometry”, 1994). In the second approach the polypeptides are digested in the gel, and the digest peptides are extracted from the gel or from excised gel spots for determination of peptide mass fingerprints by mass spectrometry (Lamer S, Jungblut P R, J Chromatogr B Biomed Sci Appl 752(2):311-322, “Matrix-assisted laser desorption-ionisation mass spectrometry peptide mass fingerprinting for proteome analysis: identification efficiency after on-blot or in-gel digestion with and without desalting procedures.” 2001).

In step 4, the digested proteins are optionally reacted with an additional mass label of this invention. Most enzymatic digestions and some of the chemical cleavage methods leave free amines on the resultant peptides of the digested fractionated proteins, which can be reacted with a mass label. This means that the same label will appear on all peptides and can be detected selectively to maximise the sensitivity of this analysis. This label could comprise a sensitivity enhancing functionality; preferably the tag comprises a cinnamic acid derivative.

Analysis of Peptides by Mass Spectrometry

The essential features of a mass spectrometer are as follows:

Inlet System->Ion Source->Mass Analyser->Ion Detector->Data Capture System

There are preferred inlet systems, ion sources and mass analysers for the purposes of analysing peptides.

Inlet Systems

A variety of mass spectrometry techniques are compatible with separation technologies, particularly capillary zone electrophoresis and High Performance Liquid Chromatography (HPLC). The choice of ionisation source is limited to some extent if a separation is required, as ionisation techniques such as MALDI and FAB (discussed below), which ablate material from a solid surface, are less suited to chromatographic separations. For most purposes, it has been very costly to link a chromatographic separation in-line with mass spectrometric analysis by one of these techniques. Dynamic FAB and ionisation techniques based on spraying, such as electrospray, thermospray and APCI, are all readily compatible with in-line chromatographic separations, and equipment to perform such liquid chromatography mass spectrometry analysis is commercially available.

Ionisation Techniques

For many biological mass spectrometry applications so called ‘soft’ ionisation techniques are used. These allow large molecules such as proteins and nucleic acids to be ionised essentially intact. The liquid phase techniques allow large biomolecules to enter the mass spectrometer in solutions with mild pH and at low concentrations. A number of techniques are appropriate for use with this invention including but not limited to Electrospray Ionisation Mass Spectrometry (ESI-MS), Fast Atom Bombardment (FAB), Matrix Assisted Laser Desorption Ionisation Mass Spectrometry (MALDI MS) and Atmospheric Pressure Chemical Ionisation Mass Spectrometry (APCI-MS).

Electrospray Ionisation

Electrospray ionisation requires that the dilute solution of the analyte molecule is ‘atomised’ into the spectrometer, i.e. injected as a fine spray. The solution is, for example, sprayed from the tip of a charged needle in a stream of dry nitrogen and an electrostatic field. The mechanism of ionisation is not fully understood but is thought to work broadly as follows. In a stream of nitrogen the solvent is evaporated. With a small droplet, this results in concentration of the analyte molecule. Given that most biomolecules have a net charge, this increases the electrostatic repulsion of the dissolved molecule. As evaporation continues this repulsion ultimately becomes greater than the surface tension of the droplet and the droplet disintegrates into smaller droplets. This process is sometimes referred to as a ‘Coulombic explosion’. The electrostatic field helps to further overcome the surface tension of the droplets and assists in the spraying process. The evaporation continues from the smaller droplets which, in turn, explode iteratively until essentially the biomolecules are in the vapour phase, as is all the solvent. This technique is of particular importance in the use of mass labels in that the technique imparts a relatively small amount of energy to ions in the ionisation process and the energy distribution within a population tends to fall in a narrower range when compared with other techniques. The ions are accelerated out of the ionisation chamber by the use of electric fields that are set up by appropriately positioned electrodes. The polarity of the fields may be altered to extract either negative or positive ions. The potential difference between these electrodes determines whether positive or negative ions pass into the mass analyser and also the kinetic energy with which these ions enter the mass spectrometer. This is of significance when considering fragmentation of ions in the mass spectrometer. The more energy imparted to a population of ions, the more likely it is that fragmentation will occur through collision of analyte molecules with the bath gas present in the source. By adjusting the electric field used to accelerate ions from the ionisation chamber it is possible to control the fragmentation of ions. This is advantageous when fragmentation of ions is to be used as a means of removing tags from a labelled biomolecule. Electrospray ionisation is particularly advantageous as it can be used in-line with liquid chromatography, referred to as Liquid Chromatography Mass Spectrometry (LC-MS).

Matrix Assisted Laser Desorption Ionisation (MALDI)

MALDI requires that the biomolecule solution be embedded in a large molar excess of a photo-excitable ‘matrix’. The application of laser light of the appropriate frequency results in the excitation of the matrix, which in turn leads to rapid evaporation of the matrix along with its entrapped biomolecule. Proton transfer from the acidic matrix to the biomolecule gives rise to protonated forms of the biomolecule which can be detected by positive ion mass spectrometry, particularly by Time-Of-Flight (TOF) mass spectrometry. Negative ion mass spectrometry is also possible by MALDI TOF. This technique imparts a significant quantity of translational energy to ions, but tends not to induce excessive fragmentation despite this. Accelerating voltages can again be used to control fragmentation with this technique, though. This technique is highly favoured for the determination of peptide mass fingerprints due to its large mass range, the prevalence of singly charged ions in its spectra and the ability to analyse multiple peptides simultaneously.

Fast Atom Bombardment

Fast Atom Bombardment (FAB) has come to describe a number of techniques for vaporising and ionising relatively involatile molecules. In these techniques a sample is desorbed from a surface by collision of the sample with a high energy beam of xenon atoms or caesium ions. The sample is coated onto a surface with a simple matrix, typically a non-volatile material, e.g. m-nitrobenzyl alcohol (NBA) or glycerol. FAB techniques are also compatible with liquid phase inlet systems: the liquid eluting from a capillary electrophoresis inlet or a high pressure liquid chromatography system passes through a frit, essentially coating the surface of the frit with analyte solution, which can be ionised from the frit surface by atom bombardment.

Mass Analysers

Fragmentation of peptides by collision induced dissociation, to determine their sequence, may be used in this invention to identify proteins not identified by the pattern of masses of their digestion products. Various mass analyser geometries may be used to fragment peptides and to determine the mass of the fragments.

MS/MS and MS Analysis of Peptides

Tandem mass spectrometers allow ions with a pre-determined mass-to-charge ratio to be selected and fragmented by collision induced dissociation (CID). The fragments can then be detected, providing structural information about the selected ion. When peptides are analysed by CID in a tandem mass spectrometer, characteristic cleavage patterns are observed, which allow the sequence of the peptide to be determined. Natural peptides typically fragment randomly at the amide bonds of the peptide backbone to give series of ions that are characteristic of the peptide. CID fragment series are denoted a_(n), b_(n), c_(n), etc. for cleavage at the n^(th) peptide bond where the charge of the ion is retained on the N-terminal fragment of the ion. Similarly, fragment series are denoted x_(n), y_(n), z_(n), etc. where the charge is retained on the C-terminal fragment of the ion.
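As an illustration of the b and y series described above, the following sketch computes singly protonated b_(n) and y_(n) m/z values for an unmodified peptide from standard monoisotopic residue masses. The function name `b_y_ions` and the tripeptide used in the example are hypothetical, and the sketch assumes no mass labels or other modifications are present; a labelled peptide would need the label mass added to the relevant residues.

```python
# Standard monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # Da, monoisotopic H2O
PROTON = 1.007276   # Da


def b_y_ions(peptide):
    """Return (b, y) lists of singly charged fragment m/z values.

    b[i] is b_(i+1) (N-terminal fragment); y[i] is y_(i+1) (C-terminal
    fragment, which carries the extra water of the intact C-terminus).
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b, y = [], []
    running = 0.0
    for m in masses[:-1]:            # N-terminal fragments
        running += m
        b.append(running + PROTON)
    running = 0.0
    for m in reversed(masses[1:]):   # C-terminal fragments
        running += m
        y.append(running + WATER + PROTON)
    return b, y


b_series, y_series = b_y_ions("GAR")  # hypothetical tripeptide
print([round(v, 4) for v in b_series])
print([round(v, 4) for v in y_series])
```

A useful check on any such calculation is that each complementary pair b_(i) and y_(n-i) sums to the neutral peptide mass plus two protons.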

Trypsin, Lys-C and thrombin are favoured cleavage agents for tandem mass spectrometry as they produce peptides with basic groups at both ends of the molecule, i.e. the alpha-amino group at the N-terminus and lysine or arginine side-chains at the C-terminus. This favours the formation of doubly charged ions, in which the charged centres are at opposite termini of the molecule. These doubly charged ions produce both C-terminal and N-terminal ion series after CID. This assists in determining the sequence of the peptide. Generally speaking, only one or two of the possible ion series are observed in the CID spectra of a given peptide. In low-energy collisions typical of quadrupole based instruments the b-series of N-terminal fragments or the y-series of C-terminal fragments predominate. If doubly charged ions are analysed then both series are often detected. In general, the y-series ions predominate over the b-series.

In general, peptides fragment via a mechanism that involves protonation of the amide backbone followed by intramolecular nucleophilic attack, leading to the formation of a 5-membered oxazolone structure and cleavage of the amide linkage that was protonated (Schlosser A. and Lehmann W. D., J. Mass Spectrom. 35: 1382-1390, “Five-membered ring formation in unimolecular reactions of peptides: a key structural element controlling low-energy collision induced dissociation”, 2000). FIG. 16a shows one proposed mechanism by which this sort of fragmentation takes place. This mechanism requires a carbonyl group from an amide bond adjacent to a protonated amide, on the N-terminal side of the protonated amide, to carry out the nucleophilic attack. A charged oxazolonium ion gives rise to b-series ions, while proton transfer from the N-terminal fragment to the C-terminal fragment gives rise to y-series ions, as shown in FIG. 16a. This requirement for an appropriately located carbonyl group does not account for cleavage at amide bonds adjacent to the N-terminal amino acid when the N-terminus is not protected and, in general, b-series ions are not seen for the amide between the N-terminal and second amino acid in a peptide. However, peptides with acetylated N-termini do meet the structural requirements of this mechanism and fragmentation can take place at the amide bond immediately after the first amino acid by this mechanism.

The ease of fragmentation of the amide backbone of a polypeptide or peptide is also significantly modulated by the side chain functionalities of the peptide. Thus the sequence of a peptide determines where it will fragment most easily. In general it is difficult to predict which amide bonds will fragment easily in a peptide sequence. This has important consequences for the design of the peptide mass tags of this invention. However, certain observations have been made that allow peptide mass tags that fragment at the desired amide bond to be designed. Proline, for example, is known to promote fragmentation at its N-terminal amide bond (Schwartz B. L., Bursey M. M., Biol. Mass Spectrom. 21:92, 1997) as fragmentation at the C-terminal amide gives rise to an energetically unfavourable strained bicyclic oxazolone structure. Aspartic acid also promotes fragmentation at its N-terminal amide bond. Asp-Pro linkages, however, are particularly labile in low energy CID analysis (Wysocki V. H. et al., J Mass Spectrom 35(12): 1399-1406, “Mobile and localized protons: a framework for understanding peptide dissociation.” 2000) and in this situation aspartic acid seems to promote the cleavage of the amide bond on its C-terminal side.

A typical tandem mass spectrometer geometry is a triple quadrupole, which comprises two quadrupole mass analysers separated by a collision chamber that is also a quadrupole. This collision quadrupole acts as an ion guide between the two mass analyser quadrupoles. A gas can be introduced into the collision quadrupole to allow collision with the ion stream from the first mass analyser. The first mass analyser selects ions on the basis of their mass/charge ratio, and the selected ions pass through the collision cell where they fragment. The fragment ions are separated and detected in the third quadrupole. Induced cleavage can be performed in geometries other than tandem analysers. Ion trap mass spectrometers can promote fragmentation through introduction of a gas into the trap itself with which trapped ions will collide. Ion traps generally contain a bath gas, such as helium, but addition of neon, for example, promotes fragmentation. Similarly, photon induced fragmentation could be applied to trapped ions. Another favourable geometry is a Quadrupole/Orthogonal Time of Flight tandem instrument where the high scanning rate of a quadrupole is coupled to the greater sensitivity of a reflectron TOF mass analyser to identify the products of fragmentation.

Conventional ‘sector’ instruments are another common geometry used in tandem mass spectrometry. A sector mass analyser comprises two separate ‘sectors’: an electric sector, which uses electric fields to focus an ion beam leaving a source into a stream of ions with the same kinetic energy, and a magnetic sector, which separates the ions on the basis of their mass to generate a spectrum at a detector. For tandem mass spectrometry a two sector mass analyser of this kind can be used where the electric sector provides the first mass analyser stage and the magnetic sector provides the second mass analyser, with a collision cell placed between the two sectors. Two complete sector mass analysers separated by a collision cell can also be used for analysis of mass tagged peptides.

Ion Traps

Ion Trap mass analysers are related to the quadrupole mass analysers. The ion trap generally has a 3 electrode construction: a cylindrical electrode with ‘cap’ electrodes at each end forming a cavity. A sinusoidal radio frequency potential is applied to the cylindrical electrode while the cap electrodes are biased with DC or AC potentials. Ions injected into the cavity are constrained to a stable circular trajectory by the oscillating electric field of the cylindrical electrode. However, for a given amplitude of the oscillating potential, certain ions will have an unstable trajectory and will be ejected from the trap. A sample of ions injected into the trap can be sequentially ejected from the trap according to their mass/charge ratio by altering the oscillating radio frequency potential. The ejected ions can then be detected, allowing a mass spectrum to be produced.

Ion traps are generally operated with a small quantity of a ‘bath gas’, such as helium, present in the ion trap cavity. This increases both the resolution and the sensitivity of the device as the ions entering the trap are essentially cooled to the ambient temperature of the bath gas through collision with the bath gas. Collisions both increase ionisation when a sample is introduced into the trap and dampen the amplitude and velocity of ion trajectories, keeping them nearer the centre of the trap. This means that when the oscillating potential is changed, ions whose trajectories become unstable gain energy more rapidly, relative to the damped circulating ions, and exit the trap in a tighter bunch, giving narrower, larger peaks.

Ion traps can mimic tandem mass spectrometer geometries; in fact they can mimic multiple mass spectrometer geometries, allowing complex analyses of trapped ions. A single mass species from a sample can be retained in a trap, i.e. all other species can be ejected, and then the retained species can be carefully excited by super-imposing a second oscillating frequency on the first. The excited ions will then collide with the bath gas and will fragment if sufficiently excited. The fragments can then be analysed further. It is possible to retain a fragment ion for further analysis by ejecting other ions and then exciting the fragment ion to fragment. This process can be repeated for as long as sufficient sample exists to permit further analysis. It should be noted that these instruments generally retain a high proportion of fragment ions after induced fragmentation. These instruments and FTICR mass spectrometers (discussed below) represent a form of temporally resolved tandem mass spectrometry rather than spatially resolved tandem mass spectrometry, which is found in linear mass spectrometers.

Fourier Transform Ion Cyclotron Resonance Mass Spectrometry (FTICR MS)

FTICR mass spectrometry has similar features to ion traps in that a sample of ions is retained within a cavity, but in FTICR MS the ions are trapped in a high vacuum chamber by crossed electric and magnetic fields. The electric field is generated by a pair of plate electrodes that form two sides of a box. The box is contained in the field of a superconducting magnet which, in conjunction with the two plates (the trapping plates), constrains injected ions to a circular trajectory between the trapping plates, perpendicular to the applied magnetic field. The ions are excited to larger orbits by applying a radio-frequency pulse to two ‘transmitter plates’ which form two further opposing sides of the box. The cycloidal motion of the ions generates corresponding electric fields in the remaining two opposing sides of the box, which comprise the ‘receiver plates’. The excitation pulses excite ions to larger orbits which decay as the coherent motion of the ions is lost through collisions. The corresponding signals detected by the receiver plates are converted to a mass spectrum by Fourier Transform (FT) analysis.
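The link between the detected signal and mass can be made concrete with the standard unperturbed cyclotron relation f = zeB/(2πm), which is textbook physics rather than something stated in this specification. The sketch below evaluates it for a given mass-to-charge ratio and field strength; the function name and the 7 tesla example field are assumptions chosen for illustration, and trapping-field corrections are ignored.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulomb (exact SI value)
DALTON = 1.66053906660e-27           # kilogram per unified atomic mass unit


def cyclotron_frequency_hz(mz, field_tesla):
    """Unperturbed cyclotron frequency for an ion of the given m/z.

    mz is in daltons per elementary charge; the result is in hertz.
    """
    return ELEMENTARY_CHARGE * field_tesla / (2 * math.pi * mz * DALTON)


# Illustrative values only: m/z 1000 in an assumed 7 T magnet.
print(cyclotron_frequency_hz(1000.0, 7.0))
```

Because frequency is inversely proportional to m/z, each peak in the Fourier transformed signal maps directly to one mass-to-charge ratio.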

For induced fragmentation experiments these instruments can perform in a similar manner to an ion trap: all ions except a single species of interest can be ejected from the trap. A collision gas can be introduced into the trap and fragmentation can be induced. The fragment ions can be subsequently analysed. Generally, fragmentation products and bath gas combine to give poor resolution if analysed by FT analysis of signals detected by the ‘receiver plates’; however, the fragment ions can be ejected from the cavity and analysed in a tandem configuration with a quadrupole, for example.

EXAMPLES

Example 1: Labelling Conditions for Thiol and Epsilon Amino Group Labelling

Since most proteins typically have one or more cysteine residues, which may be cross-linked to form disulphide bridges, and since thiol groups of cysteine are the most reactive side-chains in a polypeptide, it is essential that protocols are found that block this functionality as well as any free epsilon amino groups. The hindered Michael reagents used in this invention will react readily with thiols as well as with epsilon amino groups and so both functionalities may be labelled in a single reaction.

Alternatively, the thiols may be labelled with a different reagent prior to labelling the epsilon amino groups with the hindered Michael reagents of this invention.

Capping Thiol and Epsilon Amino Groups with Different Tags

In this example salmon Calcitonin (10 nmol, Calbiochem), which has 2 cysteine residues in a disulphide bridge, was dissolved in a denaturing buffer comprising 2 M urea, 0.5 M thiourea in 10 mM sodium carbonate at pH 7.5 in the presence of 0.2 μM tris(carboxyethyl)phosphine (TCEP). TCEP reduces disulphide bridges. The reaction mixture also contained iodoacetamide (20 equivalents per thiol site, 400 nmol), which reacts readily with free thiols. This reaction was left for 90 min. at room temperature. The pH of the buffer was then raised to between 10 and 12 by the addition of sodium hydroxide. Pyridyl propenyl sulphone was then added to the reaction to cap free lysine residues in salmon Calcitonin. This peptide has 2 lysine residues. The reaction was then desalted (Oasis hydrophilic-lipophilic balance extraction cartridge, Waters) and analysed by MALDI TOF mass spectrometry. The mass spectrum is shown in FIG. 7. As can be seen from this mass spectrum, a number of different species appear, corresponding to different labelling products of the peptide. The two different labels give rise to different combinations of incomplete reactions.

Capping Thiol and Epsilon Amino Groups with the Same Tag on One Peptide

In this Example, 10 nmol of human Calcitonin was dissolved in a denaturing buffer comprising 2 M urea, 0.5 M thiourea in 10 mM sodium carbonate at pH 7.5 in the presence of 0.2 μM tris(carboxyethyl)phosphine (TCEP). TCEP reduces disulphide bridges. This reaction was left for 30 minutes to allow complete reduction of all disulphide bridges to take place. After the reduction reaction, 40 equivalents of pyridyl propenyl sulphone per reaction site, which were assumed only to comprise epsilon amino groups and thiol groups, were added to the reaction mixture. This reaction was left for 90 min. at room temperature at pH 8. The pH of the buffer was then raised to between 11 and 12 by the addition of sodium hydroxide. The reaction mixture was left at the higher pH for 4 hours at room temperature to cap free lysine residues in the peptides. Unreacted tag was quenched with an excess of lysine. The reaction was then desalted (Oasis hydrophilic-lipophilic balance extraction cartridge, Waters) and analysed by MALDI TOF mass spectrometry. The mass spectrum is shown in FIG. 8. As can be seen from this mass spectrum, the number of different species corresponding to different labelling products of each peptide is much smaller than for the protocol using two different tags for thiols and epsilon amino groups.
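One way to see why a single tag simplifies the spectrum is to count the distinct mass shifts that incomplete labelling can produce. The sketch below does this for a peptide with two thiols and two epsilon amino groups, the counts given above for salmon Calcitonin. It is an illustration only: the function name is hypothetical, the +57.0215 Da shift is the standard carbamidomethyl addition from iodoacetamide, and the 183.04 Da value used for the sulphone tag is an assumed placeholder rather than a figure taken from this specification. Other reactive sites, such as the alpha amino group, are ignored.

```python
from itertools import product


def labelling_mass_shifts(n_thiols, n_amines, thiol_tag, amine_tag):
    """Distinct total mass shifts over every partial-labelling state.

    Each site is either labelled or not, so i thiols (0..n_thiols) and
    j amines (0..n_amines) may carry a tag in any given molecule.
    """
    shifts = {
        round(i * thiol_tag + j * amine_tag, 4)
        for i, j in product(range(n_thiols + 1), range(n_amines + 1))
    }
    return sorted(shifts)


CARBAMIDOMETHYL = 57.0215  # Da, iodoacetamide adduct on a thiol
SULPHONE_TAG = 183.04      # Da, assumed placeholder for the Michael reagent

two_tags = labelling_mass_shifts(2, 2, CARBAMIDOMETHYL, SULPHONE_TAG)
one_tag = labelling_mass_shifts(2, 2, SULPHONE_TAG, SULPHONE_TAG)
print(len(two_tags), len(one_tag))
```

Under these assumptions two different tags can give up to nine distinct masses, whereas a single tag gives at most five, because states with the same total number of tags coincide in mass.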

Capping Thiol and Epsilon Amino Groups with the Same Tag on a Mixture of Peptides

In this Example, a mixture of peptides (10 nmol of each) comprising beta-melanocyte stimulating hormone (β-MSH), alpha-melanocyte stimulating hormone (α-MSH), salmon Calcitonin and residues 1 to 24 of adrenocorticotropic hormone (ACTH (1-24)) (all available from Sigma-Aldrich, Dorset, UK) was dissolved in a denaturing buffer comprising 2 M urea, 0.5 M thiourea in 10 mM sodium borate at pH 7.5 in the presence of 0.2 μM TCEP. This reaction was left for 30 minutes to allow complete reduction of all disulphide bridges to take place. After the reduction reaction, 40 equivalents of pyridyl propenyl sulphone per reaction site, which were assumed only to comprise epsilon amino groups and thiol groups, were added to the reaction mixture. This reaction was left for 90 min. at room temperature at pH 8. The pH of the buffer was then raised to between 11 and 12 by the addition of sodium hydroxide. The reaction mixture was left at the higher pH for 4 hours at room temperature to cap free lysine residues in the peptides. Unreacted tag was quenched with an excess of lysine. The reaction was then desalted (Oasis hydrophilic-lipophilic balance extraction cartridge, Waters) and analysed by MALDI TOF mass spectrometry. The mass spectrum is shown in FIG. 9. As can be seen from this mass spectrum, the number of different species corresponding to different labelling products of each peptide is much smaller than for the protocol using two different tags for thiols and epsilon amino groups.

Capping of Unblocked Alpha Amino Groups

Following the capping of the mixture of peptides above, the unblocked alpha-amino groups were blocked with acetic acid N-hydroxysuccinimide ester. The thiol and epsilon amino capped peptides were exposed to 40 equivalents of the active ester reagent per alpha amino group in the same sodium borate buffer used previously at pH 11 for 2 hours at room temperature. The MALDI TOF mass spectrum of the products of this reaction is shown in FIG. 10. As can be seen from this figure, only one acetyl group reacts with each of the peptides that are expected to react, i.e. all of the four peptides except α-MSH. This means that the capped epsilon amino groups are resistant to reaction with the active ester reagent.

1. A method for characterising a polypeptide, which method comprises thesteps of: (a) optionally reducing cysteine disulphide bridges in thepolypeptide to form free thiols, and capping the free thiols; (b)cleaving the polypeptide with a sequence specific cleavage reagent toform peptide fragments; (c) optionally deactivating the cleavagereagent; (d) capping one or more ε-amino groups that are present with alysine reactive agent, wherein the lysine reactive agent comprises ahindered Michael reagent; (e) analysing peptide fragments by massspectrometry to form a mass fingerprint for the polypeptide; and (f)determining the identity of the polypeptide from the mass fingerprint.2. A method for characterising a population of polypeptides, whichmethod comprises the steps of: (a) optionally reducing cysteinedisulphide bridges in one or more polypeptides to form free thiols, andcapping the free thiols; (b) separating one or more polypeptides fromthe population; (c) cleaving one or more polypeptides with a sequencespecific cleavage reagent to form peptide fragments; (d) optionallydeactivating the cleavage reagent; (e) capping one or more E-aminogroups that are present with a lysine reactive agent, wherein the lysinereactive agent comprises a hindered Michael reagent; (f) analysing thepeptide fragments by mass spectrometry to form a mass fingerprint forone or more of the polypeptides; and (g) determining the identity of oneor more polypeptides from the mass fingerprint.
 3. A method forcomparing a plurality of samples, each sample comprising one or morepolypeptides, which method comprises the steps of: (a) optionallyreducing cysteine disulphide bridges and capping the free thiols in oneor more polypeptides from the samples; (b) separating one or morepolypeptides from each of the samples; (c) cleaving the polypeptideswith a sequence specific cleavage reagent to form peptide fragments; (d)optionally deactivating the cleavage reagent; (e) capping one or moreε-amino groups that are present with a lysine reactive agent, whereinthe lysine reactive agent comprises a hindered Michael reagent; (f)analysing the peptide fragments by mass spectrometry to form a massfingerprint for one or more polypeptides in the samples; and (g)determining the identity of one or more polypeptides in the samples fromone or more mass fingerprints.
 4. A method according to claim 1, whereinthe lysine-reactive agent is a labelled lysine-reactive agent.
 5. Amethod according to claim 3, for comparing a plurality of samples, eachsample comprising one or more polypeptides, which method comprises thesteps of: (a) optionally reducing cysteine disulphide bridges andcapping the free thiols in one or more polypeptides from the samples;(b) capping one or more ε-amino groups that are present in each samplewith a labelled lysine reactive agent; (c) pooling the samples; (d)separating one or more polypeptides from the pooled samples; (e)cleaving the polypeptides with a sequence specific cleavage reagent toform peptide fragments; (f) optionally deactivating the cleavagereagent; (g) analysing the peptide fragments by mass spectrometry toform a mass fingerprint for one or more polypeptides in the samples; and(h) determining the identity of one or more polypeptides in the samplesfrom one or more mass fingerprints. wherein the same label is employedfor polypeptides or peptides from the same sample, and different labelsare employed for polypeptides or peptides from different samples, suchthat the sample from which a polypeptide or peptide originates can bedetermined from its label.
 6. A method according to claim 1, wherein thesequence specific cleavage agent cleaves the one or more polypeptides onthe C-terminal side of a lysine residue.
 7. A method according to claim1, wherein the specific cleavage reagent comprises Lys-C or Trypsin. 8.A method according to claim 1, wherein the peptide fragments havingcapped ε-amino groups are removed by affinity capture, and wherein thelysine reactive agent comprises biotin.
 9. A method according to claim 1, wherein the hindered Michael agent comprises a compound having the following structure:

wherein X is an electron withdrawing group that is capable of stabilising a negative charge; the R groups independently comprise a hydrogen, a halogen, an alkyl, an aryl, or an aromatic group with the proviso that at least one of the R groups comprises a sterically hindering group; and the group R2 comprises a hydrogen, a halogen, a hydrocarbon group, an electron withdrawing group and/or a linker capable of attachment to an affinity capture functionality or a solid phase support.
 10. A method according to claim 9, wherein one R comprises a methyl or phenyl group.
 11. A method according to claim 9, wherein at least one R comprises an electron withdrawing group.
 12. A method according to claim 9, wherein at least one R comprises a cyclic or heterocyclic aromatic ring or fused ring.
 13. A method according to claim 9, wherein X comprises an —SO₂R¹ group, wherein R¹ comprises an alkyl group or an aryl group, including aromatic groups, cyclic groups, fused cyclic groups, and heterocyclic groups.
 14. A method according to claim 13, wherein R¹ comprises an electron withdrawing group.
 15. A method according to claim 13, wherein the ring comprises a phenyl, pyridyl, naphthyl, quinolyl, pyrazine, pyrimidine or triazine ring structure.
 16. A method according to claim 9, wherein the X group is substituted with an electron withdrawing group.
 17. A method according to claim 16, wherein the electron withdrawing group is selected from halogens, such as fluorine, chlorine, bromine or iodine, and nitro and nitrile groups.
 18. A method according to claim 9, wherein the X group comprises a structure capable of promoting water solubility.
 19. A method according to claim 1, wherein the polypeptide, population of polypeptides or samples comprise a sub-cellular fraction.
 20. A method according to claim 1, which further comprises preparing the polypeptide, population of polypeptides or samples by liquid chromatography.
 21. A method for assaying for one or more specific target polypeptides in a test sample, which comprises performing a method according to claim 1, wherein the sequence of the target polypeptide is determined by assaying the one or more mass fingerprints for a predetermined mass fingerprint specific to the target polypeptide.
 22. A method for determining the expression profile of one or more samples, which method comprises characterising one or more polypeptides from one or more samples, according to a method as defined in claim 1.
 23. A method according to claim 22, which method comprises identifying the quantity of each of the polypeptides detected by mass spectrometry.
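
By way of illustration only, and not as part of the claims, the cleavage, capping and fingerprint-matching steps recited above can be sketched computationally. The sketch below assumes cleavage on the C-terminal side of every lysine residue (as for Lys-C), assumes that only ε-amino groups are capped, and uses standard monoisotopic residue masses. The function names, the candidate sequences, the matching tolerance and the `cap_shift` value are hypothetical placeholders; in particular, `cap_shift` does not correspond to the mass of any specific reagent named in this document.

```python
from typing import Dict, List

# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added for the free N- and C-termini


def digest_after_lysine(sequence: str) -> List[str]:
    """Cleave on the C-terminal side of every lysine (simplified Lys-C rule)."""
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        if residue == "K":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


def peptide_mass(peptide: str, cap_shift: float = 0.0) -> float:
    """Monoisotopic mass of a peptide, plus cap_shift per lysine residue.

    cap_shift is a hypothetical mass increment for one capped epsilon-amino group.
    """
    mass = sum(MONO[r] for r in peptide) + WATER
    return mass + cap_shift * peptide.count("K")


def fingerprint(sequence: str, cap_shift: float = 0.0) -> List[float]:
    """Sorted list of theoretical fragment masses for one polypeptide."""
    return sorted(peptide_mass(p, cap_shift) for p in digest_after_lysine(sequence))


def best_match(observed: List[float], candidates: Dict[str, str],
               cap_shift: float = 0.0, tol: float = 0.02) -> str:
    """Return the candidate whose theoretical fingerprint explains most observed masses."""
    def score(seq: str) -> int:
        theo = fingerprint(seq, cap_shift)
        return sum(any(abs(o - t) <= tol for t in theo) for o in observed)
    return max(candidates, key=lambda name: score(candidates[name]))


if __name__ == "__main__":
    # Hypothetical candidate sequences, for illustration only.
    candidates = {"P1": "AGKLLKGA", "P2": "WWKFFKYY"}
    observed = fingerprint("AGKLLKGA", cap_shift=100.0)
    print(digest_after_lysine("AGKLLKGA"))
    print(best_match(observed, candidates, cap_shift=100.0))
```

In this simplified model every fragment except the C-terminal one ends in a single capped lysine, so the cap adds the same increment to each lysine-containing fragment. A practical implementation would also need to account for missed cleavages, ion charge states and instrument mass accuracy, which are omitted here.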